SortBy

Natural Order - Should a Postfix Character Affect Numerical Sort Order?

Open gokieks opened this issue 3 years ago • 4 comments

If I run natural order sort on this set {1,1-1,1-2}, I get the expected sorted order of:

1 1-1 1-2

However, if I do it on the set {1a, 1-1a, 1-2a}, then the sorted order becomes: 1-1a 1-2a 1a

To me at least, I think the natural human sort order for these two sets should be the same, with 1/1a first. Am I alone in this?

gokieks avatar Dec 02 '22 21:12 gokieks

Windows Explorer uses natural sort, with the following results. So, yes, you are alone.

[Image: grafik]

deathaxe avatar Jan 03 '25 08:01 deathaxe

I didn't ask how Windows Explorer does it, nor do I hold Windows Explorer (or Finder, or whatever) to be the holder of the One True Sorting Algorithm. What I asked was how a HUMAN would sort it, or put another way, what order would make the most sense to a human, and I absolutely guarantee that there will be more people than not who would say that "1a" should come before "1-1a" in a sorted list.

But thanks for the pithy comment 2 years later I guess.

gokieks avatar Jan 03 '25 17:01 gokieks

The algorithm is...

"In computing, natural sort order (or natural sorting) is the ordering of strings in alphabetical order, except that multi-digit numbers are treated atomically, i.e., as if they were a single character."

And this is what this plugin does, as does Windows Explorer, the prominent example provided above.

Original natsort doesn't include separator chars. Some libraries treat - optionally as part of a number to support negative values.

Otherwise, strings are sorted by their character codes, which causes "-" (0x2D) to sort before "a" (0x61).

Things become more interesting beyond ASCII charset.
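
For illustration, here is a minimal Python sketch of a natural sort key along the lines described above. It is not this plugin's actual code: the name natural_key is hypothetical, and the choice to rank digit runs ahead of other text is an assumption of the sketch. Digit runs compare as integers, and everything else compares by code point.

```python
import re

# One alternative per chunk type: a run of ASCII digits, or a run of anything else.
_CHUNK = re.compile(r"([0-9]+)|([^0-9]+)")

def natural_key(s):
    """Hypothetical natural-sort key: digit runs compare as integers,
    all other text compares by code point."""
    key = []
    for digits, text in _CHUNK.findall(s):
        if digits:
            # Tag 0: numeric chunk, compared by integer value.
            key.append((0, int(digits), ""))
        else:
            # Tag 1: text chunk, compared character by character.
            key.append((1, 0, text))
    return key

print(sorted(["1", "1-1", "1-2"], key=natural_key))
print(sorted(["1a", "1-1a", "1-2a"], key=natural_key))
```

With this key, the second list comes out as 1-1a, 1-2a, 1a, because after the leading 1 the comparison is between the text chunks "-" and "a", and "-" has the lower code point.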

deathaxe avatar Jan 03 '25 18:01 deathaxe

You (and the Wikipedia definition of natural sort, which I would not actually consider to be the definitive/universal one) are making my point: putting "1-1a" before "1a" is not actually alphabetical. Instead, it is predicated on sorting "-" before "a", which is based on their ASCII codes. And considering that the whole notion of natural sort is to "fix" issues stemming from ASCII sort (e.g. "a100" coming before "a2") so that the result makes more sense to a human, that's not a good argument for why it would be the "correct" (as in, making the most sense to a human) sort order.
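
To make the dependence on the "-" versus "a" ordering concrete, here is a hypothetical Python sketch (not something the plugin offers, and only one of several possible rules) in which letter runs are ranked ahead of punctuation runs instead of comparing them by ASCII code. The name letters_first_key is made up for this example.

```python
import re

# Three chunk types: ASCII digits, ASCII letters, everything else (separators).
_CHUNK = re.compile(r"([0-9]+)|([A-Za-z]+)|([^0-9A-Za-z]+)")

def letters_first_key(s):
    """Hypothetical key: numbers by value, and letters ranked before
    separators such as "-", regardless of their ASCII codes."""
    key = []
    for digits, letters, other in _CHUNK.findall(s):
        if digits:
            key.append((0, int(digits), ""))
        elif letters:
            key.append((1, 0, letters))
        else:
            key.append((2, 0, other))
    return key

print(sorted(["1-1a", "1-2a", "1a"], key=letters_first_key))
print(sorted(["1-2", "1", "1-1"], key=letters_first_key))
```

Under this assumed rule both sets sort with 1/1a first, which shows that the disputed result follows from how "-" is ranked against letters rather than from the atomic treatment of numbers.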

gokieks avatar Jan 03 '25 19:01 gokieks