My problem is that one entry has accents in it, especially the first letter:
@inproceedings{vsevvcikova2006automated,
  author       = {{\v{S}}ev{\v{c}}{\'{\i}}kov{\'a}, Hana and Borning, Alan and Socha, David and Bleek, Wolf-Gideon},
  title        = {Automated testing of stochastic systems: A statistically grounded approach},
  year         = 2006,
  booktitle    = {Proceedings of the 2006 international symposium on Software testing and analysis},
  organization = {ACM},
  pages        = {215--224},
}
In the bibliography this entry is sorted last, even though I have entries whose authors start with letters that come after S. How can I fix this?
Just to clarify what my preferred behavior would be for sorting:
given a string in Unicode,
normalize it,
strip diacritical marks,
fold case,
compare characterwise using Unicode code points.
But note that I only speak languages that use derivatives of the Latin alphabet; people who are familiar with other alphabets should chime in and say whether this is acceptable.
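As a minimal sketch of the steps listed above, using only the Python standard library (the function name `sort_key` and the sample author list are just for illustration, and this is not how any particular bibliography tool implements sorting):

```python
import unicodedata

def sort_key(name: str) -> str:
    """Illustrative sort key: normalize, strip diacritics, fold case."""
    # Normalize to NFD so base letters and combining marks are separate code points.
    decomposed = unicodedata.normalize("NFD", name)
    # Strip diacritical marks: drop nonspacing combining marks (category Mn).
    stripped = "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")
    # Fold case; Python then compares the resulting strings by code point.
    return stripped.casefold()

authors = ["Socha, David", "Ševčíková, Hana", "Turing, Alan", "Borning, Alan"]
print(sorted(authors, key=sort_key))
# ['Borning, Alan', 'Ševčíková, Hana', 'Socha, David', 'Turing, Alan']
```

With this key, "Ševčíková" compares as "sevcikova" and lands between "Borning" and "Socha" instead of at the end.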
I also have bibliographies in which Latin letters have diacritics, due to transliteration of the names of authors from other languages. This is also an issue for me, in the same way described by the OP.
For now, as a workaround, I am doing what @PgBiel has mentioned, which is to sort them by a manually-entered key which does not have the diacritics/accents. In one case (a relatively small bibliography), I gave up and just created the bibliography by hand.
The general notion makes sense: we want correct alphabetical ordering. But which ordering is correct depends on the language, not just the script. Your example algorithm would give an incorrect order for German and for Swedish, and German and Swedish don’t even agree with each other on how to sort Ö, for example; each has its own rules…
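To make the Ö case concrete, here is a rough sketch comparing a diacritic-stripping key (which matches the German dictionary convention of treating ö like o) with a simplified Swedish ordering, where å, ä and ö are separate letters after z. The names `strip_key`, `swedish_key` and `SWEDISH_ALPHABET` are made up for this example, and real Swedish collation has more rules than this:

```python
import unicodedata

def strip_key(word: str) -> str:
    # Diacritic-insensitive key: decompose, drop combining marks, fold case.
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(c for c in decomposed if unicodedata.category(c) != "Mn").casefold()

# Simplified assumption: the Swedish alphabet ends with å, ä, ö after z.
SWEDISH_ALPHABET = "abcdefghijklmnopqrstuvwxyzåäö"

def swedish_key(word: str) -> list[int]:
    # Keep precomposed letters (NFC) so that ö stays a letter of its own.
    composed = unicodedata.normalize("NFC", word).casefold()
    # Characters outside the alphabet sort after all letters in this sketch.
    return [SWEDISH_ALPHABET.index(c) if c in SWEDISH_ALPHABET else len(SWEDISH_ALPHABET)
            for c in composed]

names = ["Zetterberg", "Öberg", "Olsson"]
print(sorted(names, key=strip_key))    # ['Öberg', 'Olsson', 'Zetterberg']
print(sorted(names, key=swedish_key))  # ['Olsson', 'Zetterberg', 'Öberg']
```

The same three names come out in a different order depending on which language's convention is applied, so no single language-independent key can be right for both.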
Would not an option to simply ignore diacritics solve the majority of such cases and requests? This is, I believe, how LaTeX and even many pieces of bibliographic software handle it. The language-specific cases would then still have to be ordered manually through workarounds, until such day that all languages can be accounted for (or declared) in the software’s algorithm for ordering!