sort: version_cmp should keep the trailing 0 for comparison
Fix misc/sort-version.sh
This is the demo that I did at FOSDEM: https://www.youtube.com/watch?v=90Q5N1qT7BQ
I wonder if this really addresses the core of the issue. I think the problem is with stable vs unstable sorting:
# Without --stable, 04 always comes before 4
❯ echo 04\n4 | sort --sort=version
04
4
❯ echo 4\n04 | sort --sort=version
04
4
# With --stable, the order is preserved
❯ echo 04\n4 | sort --sort=version --stable
4
04
❯ echo 4\n04 | sort --sort=version --stable
04
4
I think ls then uses an unstable sort, so that it looks like it's considering the zeros to be significant. I wonder where this behaviour comes from, because it does seem to always put 04 before 4, even with many elements. If the sort were truly unstable, I'd expect more random results.
❯ echo 4\n04\n4\n04\n04\n04\n4\n4\n04\n04 | sort --sort=version
04
04
04
04
04
04
4
4
4
4
@tertsdiepraam That's weird. Doesn't that imply that 04 and 4 compare as equal or as non-equal depending on whether stable sorting is used?
By the way, with an older GNU sort version 04 always comes first, so this did change some time ago:
$ echo 04\n4 | sort --sort=version --stable
04
4
$ sort --version
sort (GNU coreutils) 8.32
...
@timvisee That's right. I think this part of the source code is to blame:
https://github.com/coreutils/coreutils/blob/5450c7f8d33f7a9ee10a2700cad9ccc6cec3626e/src/sort.c#L2822-L2829
It seems to fall back on a default order if the specified sort compares two lines as equal and --stable is not passed. Not sure why they have this behaviour.
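To make that fallback concrete, here is a minimal Python sketch of the idea, not the actual GNU or uutils code. `version_key` is a hypothetical, heavily simplified stand-in for the real version comparison (it only treats digit runs as numbers, so leading zeros are ignored), and the last-resort step is assumed to be a plain bytewise comparison of the whole line, which is what I would expect in the C locale.

```python
import re
from functools import cmp_to_key


def version_key(line):
    """Hypothetical simplified version key: digit runs compare numerically,
    so "04" and "4" produce the same key."""
    parts = re.findall(r"[0-9]+|[^0-9]+", line)
    # Tag each part so numbers and text are never compared with each other.
    return [(0, int(p), "") if "0" <= p[0] <= "9" else (1, 0, p) for p in parts]


def compare(a, b, stable=False):
    ka, kb = version_key(a), version_key(b)
    if ka != kb:
        return -1 if ka < kb else 1
    if stable:
        # With --stable, equal keys stay equal and keep their input order.
        return 0
    # Assumed last resort: bytewise comparison of the entire line.
    ra, rb = a.encode(), b.encode()
    return (ra > rb) - (ra < rb)


def version_sort(lines, stable=False):
    return sorted(lines, key=cmp_to_key(lambda a, b: compare(a, b, stable)))


print(version_sort(["4", "04"]))               # tie broken by bytes
print(version_sort(["4", "04"], stable=True))  # tie left in input order
```

Under these assumptions, `04` lands before `4` without `--stable` simply because the byte `0` sorts before the byte `4`, which would also explain why the result is not random for larger inputs.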
And my patch doesn't even work ;)
-string start 5.04.0 end of str
string start 5.5.0 end of str
string start 5.6.0 end of str
string start 5.7.0 end of str
string start 5.8.0 end of str
string start 5.9.0 end of str
+string start 5.04.0 end of str
string start 5.10.0 end of str
Found the commit: https://github.com/coreutils/coreutils/commit/d8047ae86d5418782db7ec906c10e1af4f129997#diff-e0705db8518514c907d220d1879e28500a0fb802065a65d471ba9520235136ea
Seems like it was intended behaviour to ensure a total order. Funnily enough, it seems to have been introduced based on a bug report by @miDeb which he filed while working on uutils sort, so we brought this upon ourselves 😄
- @miDeb's PR: https://github.com/uutils/coreutils/pull/2462
- The bug report: https://debbugs.gnu.org/cgi/bugreport.cgi?bug=49239
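For anyone wondering what "total order" buys in practice, here is a small Python sketch under stated assumptions: `version_key` is a hypothetical simplified key that ignores leading zeros in digit runs, and the tie-breaker is assumed to be the raw bytes of the line. It sorts every permutation of the same input and counts how many distinct outputs appear.

```python
import re
from itertools import permutations


def version_key(line):
    """Hypothetical simplified version key: leading zeros are ignored."""
    parts = re.findall(r"[0-9]+|[^0-9]+", line)
    return [(0, int(p), "") if "0" <= p[0] <= "9" else (1, 0, p) for p in parts]


lines = ["4", "04", "004", "5"]

# Key only: "4", "04" and "004" tie, so the result depends on input order.
key_only_outputs = {
    tuple(sorted(p, key=version_key)) for p in permutations(lines)
}

# Key plus raw bytes as tie-breaker: no two distinct lines ever tie.
total_order_outputs = {
    tuple(sorted(p, key=lambda s: (version_key(s), s.encode())))
    for p in permutations(lines)
}

print(len(key_only_outputs), len(total_order_outputs))
```

With the tie-breaker there is a single possible output regardless of how the input was ordered, whereas the key-only sort yields one output per arrangement of the tied lines.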
Ha ha, it would have been fun to share that in the talk at FOSDEM :)