nl: output number separator as bytes
Currently, we output the number separator as a lossy string:
$ printf "a\nb" | cargo run -q nl --number-separator=$'\xFF' | hexdump -C
00000000 20 20 20 20 20 31 ef bf bd 61 0a 20 20 20 20 20 | 1...a. |
00000010 32 ef bf bd 62 0a |2...b.|
00000016
Whereas GNU nl doesn't:
$ printf "a\nb" | nl --number-separator=$'\xFF' | hexdump -C
00000000 20 20 20 20 20 31 ff 61 0a 20 20 20 20 20 32 ff | 1.a. 2.|
00000010 62 0a |b.|
00000012
This PR fixes the issue.
Update: The PR also fixes the same issue for the content. Our current output:
$ printf $'\xFF' | cargo run -q nl | hexdump -C
00000000 20 20 20 20 20 31 09 ef bf bd 0a | 1.....|
0000000b
The output of GNU nl:
$ printf $'\xFF' | nl | hexdump -C
00000000 20 20 20 20 20 31 09 ff 0a | 1...|
00000009
CodSpeed Performance Report
Merging #9601 will improve performances by 66.23%
Comparing cakebaker:nl_fix_non_utf8_number_separator (db52eac) with main (aaa0610)
Summary
⚡ 2 improvements
✅ 125 untouched
⏩ 6 skipped[^skipped]
Benchmarks breakdown
| Benchmark | BASE |
HEAD |
Change | |
|---|---|---|---|---|
| ⚡ | nl_large_file[10] |
100.9 ms | 61.2 ms | +65.03% |
| ⚡ | nl_many_lines[100000] |
79.8 ms | 48 ms | +66.23% |
| [^skipped]: 6 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports. |
GNU testsuite comparison:
Skipping an intermittent issue tests/tail/overlay-headers (passes in this run but fails in the 'main' branch)
GNU testsuite comparison:
Skip an intermittent issue tests/tail/overlay-headers (fails in this run but passes in the 'main' branch)
Congrats! The gnu test tests/cksum/b2sum is no longer failing!
GNU testsuite comparison:
Skip an intermittent issue tests/tail/overlay-headers (fails in this run but passes in the 'main' branch)
Congrats! The gnu test tests/cksum/b2sum is no longer failing!
I was hoping we could create a macro that is similar to the new_ucmd that allows us to validate using the args and and then make sure that the stderr, stdout and error code match with the GNU implementation, I believe that would have highlighted this issue earlier right?
Superseded by https://github.com/uutils/coreutils/pull/9673