> The author wants the "worse" sort, one based on ASCII/Unicode codepoints, with...

_9ptr · 2025-09-28T17:59:55 1759082395

And where do you sort the letter ä? (After a is correct in German, but I think Swedish does it differently.)
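
A rough Python sketch of that difference, assuming heavily simplified rules: German dictionary-style order treats ä like a, while the Swedish alphabet puts å, ä, ö after z. The key functions and the word list are made up for illustration; real collation should come from a locale-aware library.

```python
# Simplified, hypothetical collation keys. Real rules cover more characters
# and more tie-breaking than this.
GERMAN_FOLD = str.maketrans({"ä": "a", "ö": "o", "ü": "u", "ß": "ss"})
SWEDISH_ALPHABET = "abcdefghijklmnopqrstuvwxyzåäö"


def german_key(word: str) -> tuple[str, str]:
    # Compare as if umlauts were their base letters; the lowercased
    # original word only breaks ties.
    lowered = word.lower()
    return (lowered.translate(GERMAN_FOLD), lowered)


def swedish_key(word: str) -> list[int]:
    # Rank each letter by its position in the Swedish alphabet.
    # Raises ValueError for characters outside that alphabet.
    return [SWEDISH_ALPHABET.index(ch) for ch in word.lower()]


words = ["zebra", "äpple", "apa", "ö", "bil"]
german_order = sorted(words, key=german_key)
swedish_order = sorted(words, key=swedish_key)

print(german_order)   # ['apa', 'äpple', 'bil', 'ö', 'zebra']
print(swedish_order)  # ['apa', 'bil', 'zebra', 'äpple', 'ö']
```

The same five strings end up in two different orders, which is why the answer depends on whose alphabet you pick.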

dvdkon · 2025-09-28T18:15:58 1759083358

This feels like the right moment to mention "ch", which is considered a letter in orthodox Czech, sorted between "h" and "i". The problem is, you can't reliably distinguish between "ch"-the-letter and "ch" as just "c" and "h" combined, which are present in loan words but also some original Czech compound words.

So if you're doing it "properly", sorting strings in Czech involves understanding the etymology of every word.
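
To illustrate, a minimal Python sketch that treats every "ch" as the single letter, assuming a simplified alphabet without diacritics. The names `czech_key` and `RANK` are invented here. The greedy tokenizer is exactly the part that cannot tell a real "ch" from "c" followed by "h".

```python
# Simplified Czech alphabet: diacritics left out, "ch" is one letter
# placed between "h" and "i".
CZECH_LETTERS = list("abcdefgh") + ["ch"] + list("ijklmnopqrstuvwxyz")
RANK = {letter: i for i, letter in enumerate(CZECH_LETTERS)}


def czech_key(word: str) -> list[int]:
    # Greedy tokenizer: every "c" directly followed by "h" is read as "ch".
    # Raises KeyError for characters outside the simplified alphabet.
    word = word.lower()
    ranks = []
    i = 0
    while i < len(word):
        if word.startswith("ch", i):
            ranks.append(RANK["ch"])
            i += 2
        else:
            ranks.append(RANK[word[i]])
            i += 1
    return ranks


words = ["chata", "cesta", "hrad", "igelit", "duben"]
czech_order = sorted(words, key=czech_key)
codepoint_order = sorted(words)

print(czech_order)      # ['cesta', 'duben', 'hrad', 'chata', 'igelit']
print(codepoint_order)  # ['cesta', 'chata', 'duben', 'hrad', 'igelit']
```

A word where "c" and "h" merely meet at a morpheme boundary would be misfiled by this key, and nothing in the string itself tells you that.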

bmn__ · 2025-09-29T11:34:01 1759145641

What a headache! I'm glad that the relevant standard ČSN 97 6030 does not demand analysis of compounds or knowledge of etymology.

jcynix · 2025-09-28T18:32:30 1759084350

That's why we have all this LC_* stuff in Linux, which you can configure to your needs:

  export LC_MEASUREMENT="de_DE"
  export LC_MONETARY="de_DE"
  export LC_PAPER="de_DE"
  export LC_CTYPE=de_DE.UTF-8
  export LC_MESSAGES="en_US.UTF-8"
  export LC_RESPONSE="en_US.UTF-8"
  export LC_TIME=en_US.UTF-8

Mix in your Swedish or Swahili, maybe even the Vatican State:

   e.g. de_DE, sw_TZ, it_VA (not guaranteed ;-).

ongy · 2025-09-29T08:15:00 1759133700

> export LC_TIME=en_US.UTF-8

Why would you do this to yourself?

jcynix · 2025-09-29T12:58:31 1759150711

Why? For example, to avoid diacritics in month names. Treat these as examples: you can easily add them to a shell script to make it work the way you want.

ongy · 2025-10-07T16:55:39 1759856139

But you get

* 12h time

* Sunday start of week

* Silly pyramid mm/dd/yyyy

chongli · 2025-09-29T03:22:12 1759116132

How does this work if you're a multi-lingual person and you have files with names in different languages?

jcynix · 2025-09-29T13:27:45 1759152465

I'm multi-lingual, but I try to keep things separate, for example business stuff (multi-lingual) from private stuff (mostly one language), so clashes between languages rarely happen.

But if it gets complicated I'll usually resort to Perl scripts to take care of pesky details. Sorting an associative array where the key is a string in unified form and the value is the multi-lingual target is rather easy in a scripting language you are fluent in.
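
The same idea sketched in Python rather than Perl, as an illustration only. What "unified form" means is an assumption here: accents stripped and case folded. The `unify` helper and the file names are made up.

```python
import unicodedata


def unify(name: str) -> str:
    # Hypothetical "unified form": decompose, drop combining marks, fold case.
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


names = ["Zürich.txt", "école.txt", "Ärger.txt", "apple.txt"]

# Key: unified string, value: the original multi-lingual name.
# Names that unify to the same key would overwrite each other here.
by_key = {unify(name): name for name in names}
ordered = [by_key[key] for key in sorted(by_key)]

print(ordered)  # ['apple.txt', 'Ärger.txt', 'école.txt', 'Zürich.txt']
```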

chithanh · 2025-09-29T06:46:21 1759128381

The sorting order is only defined between strings of the same locale, not between strings of different locales.

You can specify the sorting order per command like

  LC_COLLATE="tr_TR.utf8" ls

if it differs from your system or user locale.

An alternative is to first transliterate the strings to ASCII and then sort them (but this does not preserve the sorting order of non-Latin scripts).
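
A small Python sketch of that caveat, using a hand-written transliteration table that covers only the two example words (the table is an assumption, not any standard scheme):

```python
# Tiny hypothetical transliteration table, just enough for the words below.
TRANSLIT = str.maketrans(
    {"В": "V", "Г": "G", "а": "a", "г": "g", "л": "l", "о": "o", "р": "r"}
)

words = ["Гора", "Волга"]

# Codepoint order, which matches Cyrillic alphabet order for these letters.
native_order = sorted(words)

# Order after transliterating to ASCII ("Gora", "Volga").
ascii_order = sorted(words, key=lambda w: w.translate(TRANSLIT))

print(native_order)  # ['Волга', 'Гора']
print(ascii_order)   # ['Гора', 'Волга']
```

В comes before Г in the Cyrillic alphabet, but V comes after G in ASCII, so the two words swap places.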

1718627440 · 2025-10-01T20:40:03 1759351203

You could alias cd to a shell script that sets the env based on the location.

sebtron · 2025-09-28T20:58:21 1759093101

> I want the author's opinion on how capital and lowercase letters should be sorted. Do they follow strict ASCII/Unicode codepoints, or do they normalize into actual alphabetical order and sort upper/lower within each letter?

I prefer the strict ASCII / Unicode sorting (all capitals first, then all lowercase).
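
For comparison, both options in a short Python sketch with a made-up word list. Python's default string comparison is by codepoint; the second key is just one possible reading of "sort upper/lower within each letter".

```python
words = ["banana", "Cherry", "apple", "Banana", "cherry", "Apple"]

# Strict codepoint order: every uppercase ASCII letter sorts before
# any lowercase one.
codepoint_order = sorted(words)

# Alternative from the quoted question: compare case-insensitively first,
# then let the codepoint order put uppercase before lowercase on ties.
grouped_order = sorted(words, key=lambda w: (w.casefold(), w))

print(codepoint_order)  # ['Apple', 'Banana', 'Cherry', 'apple', 'banana', 'cherry']
print(grouped_order)    # ['Apple', 'apple', 'Banana', 'banana', 'Cherry', 'cherry']
```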

jowea · 2025-09-28T17:50:52 1759081852

Asciibetical sorting