coreutils
coreutils copied to clipboard
printf: trim leading whitespace when parsing numeric values
Trim leading whitespace from numeric input to printf.
This is what seq does here. Note that the rust docs for trim_start state the following:
‘Whitespace’ is defined according to the terms of the Unicode Derived Core Property White_Space, which includes newlines.
So technically the following assert would pass
assert_eq!(Ok(-2), ParsedNumber::parse_i64("\n\t -0x2"));
It looks like this does not currently impact printf or seq, but is probably something to be aware of.
$ cargo run printf "%f\n" " \n 0x1"
Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.11s
Running `target/debug/coreutils printf '%f\n' ' \n 0x1'`
printf: '\ \\n\ 0x1': expected a numeric value
0.000000
Fixes: #7505
-
That change should actually change
printfbehaviour (seqcurrently uses totally different parsing code, I'm trying to unify them...). Not sure why you don't see any change, but you really want to test thatcargo run printf "%f\n" " 0x0"works. And maybe add an end-to-end test intest_printf.rstoo. -
You really only want to trim spaces only, not other characters. At least that's what GNU coreutils seems to do: (I didn't test other unicode spaces ,-P):
$ env printf "%f\n" "\t0x1"
printf: ‘\\t0x1’: expected a numeric value
0.000000
$ env printf "%f\n" "\n0x1"
printf: ‘\\n0x1’: expected a numeric value
0.000000
$ env printf "%f\n" " 0x1"
1.000000
- That change should actually change
printfbehaviour (seqcurrently uses totally different parsing code, I'm trying to unify them...). Not sure why you don't see any change, but you really want to test thatcargo run printf "%f\n" " 0x0"works.
Sorry, my PR text was unclear. This PR does change printf behavior and fixes the issue, but I just wanted to make sure reviewers were aware that technically ParsedNumber::parse_<some integer> would trim all whitespace. So there is a semantic question there of is rejecting newlines the responsibility of parse_<some integer> or the responsibility of the caller etc? printf already throws errors for other whitespace like \n and \t.
And maybe add an end-to-end test in
test_printf.rstoo.
Yup! Good suggestion.
- You really only want to trim spaces only, not other characters. At least that's what GNU coreutils seems to do: (I didn't test other unicode spaces ,-P):
$ env printf "%f\n" "\t0x1" printf: ‘\\t0x1’: expected a numeric value 0.000000 $ env printf "%f\n" "\n0x1" printf: ‘\\n0x1’: expected a numeric value 0.000000 $ env printf "%f\n" " 0x1" 1.000000
👍 awesome. The current patch ends up with the same output, but there is still the question of should ParsedNumber::parse also reject non ascii 0x20 whitespace.
- That change should actually change
printfbehaviour (seqcurrently uses totally different parsing code, I'm trying to unify them...). Not sure why you don't see any change, but you really want to test thatcargo run printf "%f\n" " 0x0"works.Sorry, my PR text was unclear. This PR does change
printfbehavior and fixes the issue, but I just wanted to make sure reviewers were aware that technicallyParsedNumber::parse_<some integer>would trim all whitespace. So there is a semantic question there of is rejecting newlines the responsibility ofparse_<some integer>or the responsibility of the caller etc?printfalready throws errors for other whitespace like\nand\t.
Oh! I misunderstood yes ,-) Do you know what code is rejecting non-whitespace leading characters in printf? It's not super obvious from a very quick glance I had...
I feel I'd rather have the logic in the common parse function, seq will soon use that too, and I think we'll want to convert timeout to use the common parsing code too (#7475)... But if the rejection happens in some other shared code, maybe we can argue.
And maybe add an end-to-end test in
test_printf.rstoo.Yup! Good suggestion.
- You really only want to trim spaces only, not other characters. At least that's what GNU coreutils seems to do: (I didn't test other unicode spaces ,-P):
$ env printf "%f\n" "\t0x1" printf: ‘\\t0x1’: expected a numeric value 0.000000 $ env printf "%f\n" "\n0x1" printf: ‘\\n0x1’: expected a numeric value 0.000000 $ env printf "%f\n" " 0x1" 1.000000👍 awesome. The current patch ends up with the same output, but there is still the question of should
ParsedNumber::parsealso reject non ascii 0x20 whitespace.
Great ,-)
Highly surprising, but seq and printf should actually both ignore all real whitespace :exploding_head:
Did some digging and ran some examples with GNU coreutils:
$ seq 1 " $(printf '\n\t 0x2')"
1
2
$ printf "%f\n" "$(printf '\n\t 0x2')"
2.000000
I've learned something new :laughing: I did not expect this to work.
Escape sequences are another story:
$ printf "%f\n" "\t\n 0x5"
-bash: printf: \t\n 0x5: invalid number
0.000000
$ seq 1 "\t\n 0x2"
seq: invalid floating point argument: ‘\\t\\n 0x2’
Try 'seq --help' for more information.
So I'll update the tests with some additions that cover more real whitespace types and add some tests to test_printf.rs.
GNU testsuite comparison:
GNU test failed: timeout/timeout.log. timeout/timeout.log is passing on 'main'. Maybe you have to rebase?
GNU testsuite comparison:
GNU test failed: misc/stdbuf.log. misc/stdbuf.log is passing on 'main'. Maybe you have to rebase?
GNU test failed: timeout/timeout.log. timeout/timeout.log is passing on 'main'. Maybe you have to rebase?
Don't forget to run cargo fmt and cargo clippy.
Otherwise, LGTM!
(turns out I'll need this one to pass seq tests with the common parsing methods!)
GNU testsuite comparison:
Congrats! The gnu test tests/tail/inotify-dir-recreate is no longer failing!