> The author wants the "worse" sort, one based on ASCII/Unicode codepoints, without any intelligence for numbers that 99% of GUI users want.
I want the author's opinion on how capital and lowercase letters should be sorted. Do they follow strict ASCII/Unicode codepoints, or do they normalize into actual alphabetical order and sort upper/lower within each letter?
This feels like the right moment to mention "ch", which is considered a single letter in Czech orthography, sorted between "h" and "i". The problem is, you can't reliably distinguish between "ch"-the-letter and "ch" as just "c" and "h" combined, which occurs in loan words but also in some native Czech compound words.
So if you're doing it "properly", sorting strings in Czech involves understanding the etymology of every word.
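To make the "ch" problem concrete, here is a minimal Python sketch of the naive approach. It assumes a simplified alphabet without diacritics and treats every "ch" as the single letter, which is exactly the shortcut that fails for the compound words and loan words mentioned above. The names `ALPHABET`, `RANK`, and `czech_key` are made up for illustration.

```python
# Simplified alphabet: diacritics are left out for brevity (an assumption;
# real Czech collation also has to handle letters such as č, ř, š, ž).
ALPHABET = ["a", "b", "c", "d", "e", "f", "g", "h", "ch", "i", "j", "k",
            "l", "m", "n", "o", "p", "q", "r", "s", "t", "u", "v", "w",
            "x", "y", "z"]
RANK = {letter: i for i, letter in enumerate(ALPHABET)}


def czech_key(word):
    """Naive sort key: every 'ch' is treated as the single letter 'ch'."""
    w = word.lower()
    key = []
    i = 0
    while i < len(w):
        if w.startswith("ch", i):
            # No etymology check here: 'c' + 'h' is always merged.
            key.append(RANK["ch"])
            i += 2
        else:
            # Characters outside the simplified alphabet sort last.
            key.append(RANK.get(w[i], len(ALPHABET)))
            i += 1
    return key


words = ["cihla", "hrad", "chata", "jablko", "ihned"]
naive = sorted(words)                 # plain codepoint order
czech = sorted(words, key=czech_key)  # 'ch' placed between 'h' and 'i'

print(naive)  # ['chata', 'cihla', 'hrad', 'ihned', 'jablko']
print(czech)  # ['cihla', 'hrad', 'chata', 'ihned', 'jablko']
```

Telling the two kinds of "ch" apart would need a word list or dictionary lookup on top of this, which the sketch does not attempt.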
Why? For example, to avoid diacritics in month names? Take them as examples, since you can easily add them to a shell script to make it work the way you want.
I'm multilingual but try to separate, for example, business stuff (multilingual) from private stuff (mostly one language), so clashes between languages rarely happen.
But if it gets complicated I'll usually resort to Perl scripts to take care of pesky details. Sorting an associative array where the key is a string in unified form and the value is the multilingual target is rather easy in a scripting language one is fluent in.
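The associative-array idea could look roughly like this. The sketch is in Python rather than Perl, and "unified form" is only assumed here to mean "diacritics stripped and case folded"; the helper `unify` and the sample names are hypothetical.

```python
import unicodedata


def unify(text):
    """Assumed 'unified form': decompose, drop combining marks, casefold."""
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed
                       if not unicodedata.combining(ch))
    return stripped.casefold()


names = ["Žofie", "zebra", "Émile", "eagle", "Zürich"]

# Key: unified form, value: the original multilingual string.
table = {unify(name): name for name in names}

# Sort on the unified keys, then emit the original values.
ordered = [table[key] for key in sorted(table)]

print(ordered)  # ['eagle', 'Émile', 'zebra', 'Žofie', 'Zürich']
```

Note that two strings with the same unified form would overwrite each other in `table`; if that can happen, a list of `(key, value)` pairs is the safer structure.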
> I want the author's opinion on how capital and lowercase letters should be sorted. Do they follow strict ASCII/Unicode codepoints, or do they normalize into actual alphabetical order and sort upper/lower within each letter?
I prefer the strict ASCII / Unicode sorting (all capitals first, then all lowercase).
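For anyone following along, the two orderings being discussed can be compared in a few lines of Python. This is only an illustration with made-up sample words, not anyone's actual file manager logic.

```python
words = ["banana", "Apple", "apple", "Banana", "cherry", "Cherry"]

# Strict codepoint order: all capitals sort before all lowercase letters.
codepoint = sorted(words)

# Alphabetical order: compare case-insensitively first, then fall back to
# the codepoint order so upper/lower is decided within each word.
alphabetical = sorted(words, key=lambda w: (w.casefold(), w))

print(codepoint)     # ['Apple', 'Banana', 'Cherry', 'apple', 'banana', 'cherry']
print(alphabetical)  # ['Apple', 'apple', 'Banana', 'banana', 'Cherry', 'cherry']
```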