coreutils icon indicating copy to clipboard operation
coreutils copied to clipboard

nl: output number separator as bytes

Open cakebaker opened this issue 3 weeks ago • 5 comments

Currently, we output the number separator as a lossy string:

$ printf "a\nb" | cargo run -q nl --number-separator=$'\xFF' | hexdump -C
00000000  20 20 20 20 20 31 ef bf  bd 61 0a 20 20 20 20 20  |     1...a.     |
00000010  32 ef bf bd 62 0a                                 |2...b.|
00000016

Whereas GNU nl doesn't:

$ printf "a\nb" | nl --number-separator=$'\xFF' | hexdump -C
00000000  20 20 20 20 20 31 ff 61  0a 20 20 20 20 20 32 ff  |     1.a.     2.|
00000010  62 0a                                             |b.|
00000012

This PR fixes the issue.

Update: The PR also fixes the same issue for the content. Our current output:

$ printf $'\xFF' | cargo run -q nl | hexdump -C
00000000  20 20 20 20 20 31 09 ef  bf bd 0a                 |     1.....|
0000000b 

The output of GNU nl:

$ printf $'\xFF' | nl | hexdump -C
00000000  20 20 20 20 20 31 09 ff  0a                       |     1...|
00000009

cakebaker avatar Dec 08 '25 10:12 cakebaker

CodSpeed Performance Report

Merging #9601 will improve performances by 66.23%

Comparing cakebaker:nl_fix_non_utf8_number_separator (db52eac) with main (aaa0610)

Summary

⚡ 2 improvements
✅ 125 untouched
⏩ 6 skipped[^skipped]

Benchmarks breakdown

Benchmark BASE HEAD Change
nl_large_file[10] 100.9 ms 61.2 ms +65.03%
nl_many_lines[100000] 79.8 ms 48 ms +66.23%
[^skipped]: 6 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

codspeed-hq[bot] avatar Dec 08 '25 11:12 codspeed-hq[bot]

GNU testsuite comparison:

Skipping an intermittent issue tests/tail/overlay-headers (passes in this run but fails in the 'main' branch)

github-actions[bot] avatar Dec 08 '25 11:12 github-actions[bot]

GNU testsuite comparison:

Skip an intermittent issue tests/tail/overlay-headers (fails in this run but passes in the 'main' branch)
Congrats! The gnu test tests/cksum/b2sum is no longer failing!

github-actions[bot] avatar Dec 08 '25 14:12 github-actions[bot]

GNU testsuite comparison:

Skip an intermittent issue tests/tail/overlay-headers (fails in this run but passes in the 'main' branch)
Congrats! The gnu test tests/cksum/b2sum is no longer failing!

github-actions[bot] avatar Dec 08 '25 15:12 github-actions[bot]

I was hoping we could create a macro that is similar to the new_ucmd that allows us to validate using the args and and then make sure that the stderr, stdout and error code match with the GNU implementation, I believe that would have highlighted this issue earlier right?

ChrisDryden avatar Dec 10 '25 18:12 ChrisDryden

Superseded by https://github.com/uutils/coreutils/pull/9673

cakebaker avatar Dec 16 '25 14:12 cakebaker