coreutils icon indicating copy to clipboard operation
coreutils copied to clipboard

Move ExtendedBigDecimal to uucore/format, make use of it in formatting functions

Open drinkcat opened this issue 9 months ago • 18 comments

Okay, this one is a bit of a large PR... I can split it up if needed (e.g. everything up to uucode: format: Change Formatter to take an &ExtendedBigDecimal is fairly mechanical). Also, I'm not too worried about having to rewrite stuff if needed (one of my purpose to contribute is to learn about Rust ,-)), so please let me know ,-)

The idea here is to switch formatting functions to use ExtendedBigDecimal, instead of f64. This allows us to get arbitrary precision with custom format in seq:

cargo run seq -f "%.32f" 0.1 1 1
0.10000000000000000000000000000000

Note that that doesn't match what coreutils outputs on x86, since they use an 80-bit int:

0.10000000000000000000135525271561

But that's better than before this change, where we used a f64:

0.10000000000000000555111512312578

At least now we have all the digits, so we could find ways to trim the floating point precision back to match 80-bit float if we wanted to.

We also take it as an opportunity to (partially) fix the hexadecimal format:

cargo run seq -f "%.32a" 0.1 1 1
0xc.cccccccccccccccccccccccccccccccdp-7

Again, doesn't fully match what coreutils does, but we could trim the precision:

0xc.ccccccccccccccd00000000000000000p-7

Note that I'm using seq and not printf, as the parsing code still uses f64... There's also quite a bit of potential for this change on the parsing side: I think a lot of custom logic in seq could then be removed.


seq: Make use of uucore::format to print in all cases

Now that uucore format functions take in an ExtendedBigDecimal, we can use those in all cases.

uucore: format: Pad non-finite numbers with spaces, not zeros

printf "%05.2f" inf should print inf, not 00inf.

Add a test to cover that case, too.

uucode: format: format_float_hexadecimal: Take in &BigDecimal

Display hexadecimal floats with arbitrary precision.

Note that some of the logic will produce extremely large BitInt as intermediate values: there is some optimization possible here, but the current implementation appears to work fine for reasonable numbers (e.g. whatever would previously fit in a f64, and even with somewhat large precision).

uucode: format: format_float_shortest: Take in &BigDecimal

Similar logic to scientific printing. Also add a few more tests around corner cases where we switch from decimal to scientific printing.

uucode: format: format_float_scientific: Take in &BigDecimal

No more f64 operations needed, we just trim (or extend) BigDecimal to appropriate precision, get the digits as a string, then add the decimal point.

Similar to what BigDecimal::write_scientific_notation does, but we need a little bit more control.

uucode: format: format_float_decimal: Take in &BigDecimal

Also add a few unit tests to make sure precision is not lost anymore.

uucode: format: format_float_non_finite: Take in &ExtendedBigDecimal

First modify Format.fmt to extract absolute value and sign, then modify printing on non-finite values (inf or nan).

uucode: format: Change Formatter to take an &ExtendedBigDecimal

Only changes the external interface, right now the number is casted back to f64 for printing. We'll update that in follow-up.

uucore: format: extendedbigdecimal: Implement From

Allows easier conversion.

uucore: format: extendedbigdecimal: Add MinusNan

Some test cases require to handle "negative" NaN. Handle it similarly to "positive" NaN.

uucore: format: Make Formatter a generic

Using an associated type in Formatter trait was quite nice, but, in a follow-up change, we'd like to pass a reference to the Float Formatter, while just passing i64/u64 as a value to the Int formatters. Associated type doesn't allow for that, so we turn it into a generic instead.

This makes Format<> a bit more complicated though, as we need to specify both the Formatter, and the type to be formatted.

seq: Move extendedbigdecimal.rs to uucore/features/format

Will make it possible to directly print ExtendedBigDecimal in seq, and gradually get rid of limited f64 precision in other tools (e.g. printf).

Changes are mostly mechanical, we reexport ExtendedBigDecimal directly in format to keep the imports slightly shorter.

drinkcat avatar Mar 15 '25 17:03 drinkcat

GNU testsuite comparison:

Skipping an intermittent issue tests/timeout/timeout (passes in this run but fails in the 'main' branch)

github-actions[bot] avatar Mar 15 '25 17:03 github-actions[bot]

why do move it to uucore? it seems that it is only used b seq, no ?

sylvestre avatar Mar 16 '25 09:03 sylvestre

I should have made this a bit more clear, this is just the first step in many.

why do move it to uucore? it seems that it is only used b seq, no ?

Correct, it was only used in seq, but now printf uses it too, for formatting only (no parsing yet). Also, in seq, I want to be able to directly format ExtendedBigDecimal when a format -f ... is specified on the command line (currently seq bypasses uucore/format for the default format, maybe we can simplify this already, I can look).

The eventual idea is to start using ExtendedBigDecimal in all floating number parsing and formatting (that's seq and printf yes). By using f64 currently, we lose digits compared to GNU coreutils implementations (which uses 64/80/128-bit floats depending on the native long double size), so we can't properly do hex float formatting in seq or printf (#7364), or actually, any format with high precision (e.g. %.24f).

The output is still different from GNU coreutils because:

  • printf uses uucore/format parsing code that still returns f64. So printf "%.24f" 0.1 => 0.100000000000000005551115 while GNU coreutils gives us 0.100000000000000000001355. We need to convert the parsing code to return ExtendedBigDecimal as well (we'd then get a clean 0.100000000000000000000000, see next point)
  • seq does parse ExtendedBigDecimal, and now prints ExtendedBigDecimal even when a format is specified, so it does output numbers that are more precise that GNU coreutils on any platforms (but now that we have the precision, we could trim it down).

Hope this makes sense ,-)

drinkcat avatar Mar 16 '25 16:03 drinkcat

Added 2 commits, to make full use of the new formatting function in seq:

  • uucore: format: Pad non-finite numbers with spaces, not zeros this is something seq did right, but not printf (i.e. uucore/format), so it's nice that both utilities use the same printing function now.
  • seq: Make use of uucore::format to print in all cases: That removes quite a bit of code that was, in some way, duplicated between uucore and seq.

(spotted #7466... for another PR)

drinkcat avatar Mar 16 '25 19:03 drinkcat

GNU testsuite comparison:

Skipping an intermittent issue tests/timeout/timeout (passes in this run but fails in the 'main' branch)

github-actions[bot] avatar Mar 16 '25 20:03 github-actions[bot]

Did you look at performances ? :)

sylvestre avatar Mar 18 '25 09:03 sylvestre

seq-main is main branch, ./target/release/seq is this branch.

GNU coreutils is much faster in most cases, so mostly focusing on main vs this branch:

4% worse than main on integers:

hyperfine  "./target/release/seq 1000000" "./seq-main 1000000" "seq 1000000"
Benchmark 1: ./target/release/seq 1000000
  Time (mean ± σ):     309.6 ms ±   4.9 ms    [User: 252.6 ms, System: 56.4 ms]
  Range (min … max):   303.7 ms … 318.0 ms    10 runs
 
Benchmark 2: ./seq-main 1000000
  Time (mean ± σ):     297.0 ms ±   6.2 ms    [User: 238.1 ms, System: 58.3 ms]
  Range (min … max):   289.3 ms … 312.8 ms    10 runs
 
Benchmark 3: seq 1000000
  Time (mean ± σ):      10.3 ms ±   2.1 ms    [User: 9.5 ms, System: 0.8 ms]
  Range (min … max):     5.0 ms …  13.6 ms    357 runs
 
Summary
  seq 1000000 ran
   28.77 ± 5.92 times faster than ./seq-main 1000000
   29.99 ± 6.16 times faster than ./target/release/seq 1000000

~15% worse on floats with default format:

hyperfine  "./target/release/seq 0 0.000001 1" "./seq-main 0 0.000001 1" "seq 0 0.000001 1"
Benchmark 1: ./target/release/seq 0 0.000001 1
  Time (mean ± σ):     336.4 ms ±   4.4 ms    [User: 277.1 ms, System: 58.5 ms]
  Range (min … max):   332.2 ms … 347.4 ms    10 runs
 
Benchmark 2: ./seq-main 0 0.000001 1
  Time (mean ± σ):     297.0 ms ±   2.0 ms    [User: 233.3 ms, System: 63.1 ms]
  Range (min … max):   294.2 ms … 300.9 ms    10 runs
 
Benchmark 3: seq 0 0.000001 1
  Time (mean ± σ):     111.8 ms ±   3.2 ms    [User: 111.3 ms, System: 0.5 ms]
  Range (min … max):   108.4 ms … 119.5 ms    26 runs
 
Summary
  seq 0 0.000001 1 ran
    2.66 ± 0.08 times faster than ./seq-main 0 0.000001 1
    3.01 ± 0.10 times faster than ./target/release/seq 0 0.000001 1

but 2+ times better on custom float formats, especially with large precision:

hyperfine  "./target/release/seq -f'%.24f' 0 0.000001 1" "./seq-main -f'%.24f' 0 0.000001 1" "seq -f'%.24f' 0 0.000001 1"
Benchmark 1: ./target/release/seq -f'%.24f' 0 0.000001 1
  Time (mean ± σ):     361.6 ms ±   8.9 ms    [User: 303.7 ms, System: 57.1 ms]
  Range (min … max):   353.9 ms … 383.0 ms    10 runs
 
Benchmark 2: ./seq-main -f'%.24f' 0 0.000001 1
  Time (mean ± σ):     876.4 ms ±   2.4 ms    [User: 808.3 ms, System: 66.2 ms]
  Range (min … max):   874.2 ms … 881.8 ms    10 runs
 
Benchmark 3: seq -f'%.24f' 0 0.000001 1
  Time (mean ± σ):     215.8 ms ±   5.1 ms    [User: 214.8 ms, System: 0.6 ms]
  Range (min … max):   209.4 ms … 224.2 ms    13 runs
 
Summary
  seq -f'%.24f' 0 0.000001 1 ran
    1.68 ± 0.06 times faster than ./target/release/seq -f'%.24f' 0 0.000001 1
    4.06 ± 0.10 times faster than ./seq-main -f'%.24f' 0 0.000001 1

drinkcat avatar Mar 18 '25 13:03 drinkcat

I could make seq 10% faster than main for integers by adding a bypass in the formatting code:

diff --git a/src/uucore/src/lib/features/format/num_format.rs b/src/uucore/src/lib/features/format/num_format.rs
index 3a22fe0442f9..83543f14c863 100644
--- a/src/uucore/src/lib/features/format/num_format.rs
+++ b/src/uucore/src/lib/features/format/num_format.rs
@@ -356,6 +356,9 @@ fn format_float_decimal(bd: &BigDecimal, precision: usize, force_decimal: ForceD
     debug_assert!(!bd.is_negative());
     if precision == 0 && force_decimal == ForceDecimal::Yes {
         format!("{bd:.0}.")
+    } else if bd.is_integer() {
+        let (bi, _) = bd.as_bigint_and_scale();
+        bi.to_str_radix(10)
     } else {
         format!("{bd:.precision$}")
     }

I suspect there's a lot of small optimizations we can do. But maybe in this case seq should detect we are dealing with integers and use i64 (or a BigInt)? We could also fall back to f64 when the precision doesn't matter (e.g., if you print a few digits only...).

I'll try to look a bit more at floats with default format...

drinkcat avatar Mar 18 '25 13:03 drinkcat

For the second case, we can reduce the slowdown to 5% with a bypass in write_output when width == 0:

diff --git a/src/uucore/src/lib/features/format/num_format.rs b/src/uucore/src/lib/features/format/num_format.rs
index 3a22fe0442f9..d8d60c8ba87f 100644
--- a/src/uucore/src/lib/features/format/num_format.rs
+++ b/src/uucore/src/lib/features/format/num_format.rs
@@ -614,6 +619,11 @@ fn write_output(
     width: usize,
     alignment: NumberAlignment,
 ) -> std::io::Result<()> {
+    if width == 0 {
+        writer.write(sign_indicator.as_bytes())?;
+        writer.write(s.as_bytes())?;
+        return Ok(())
+    }

Happy to add those 2 small optimizations if needed.

drinkcat avatar Mar 18 '25 14:03 drinkcat

yeah, probably better and don't hesitate to document your benchmark in the BENCHMARKING.md file

sylvestre avatar Mar 18 '25 14:03 sylvestre

Done, also filed #7482 for the most egregious performance gap.

(also discovered that useful hyperfine syntax: hyperfine -L seq seq,target/release/seq "{seq} 1000000")

drinkcat avatar Mar 18 '25 15:03 drinkcat

GNU testsuite comparison:

Skip an intermittent issue tests/misc/stdbuf (fails in this run but passes in the 'main' branch)
GNU test failed: tests/seq/seq. tests/seq/seq is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/seq/seq-extra-number. tests/seq/seq-extra-number is passing on 'main'. Maybe you have to rebase?
GNU test failed: tests/seq/seq-precision. tests/seq/seq-precision is passing on 'main'. Maybe you have to rebase?
Congrats! The gnu test tests/misc/tee is no longer failing!

github-actions[bot] avatar Mar 18 '25 15:03 github-actions[bot]

Forgot to run fmt/clippy... and didn't properly run tests... Fixed now.

Just for reference, benchmarks vs main: 25% better on integers, 5% slower on default floats:

$ taskset -c 0 hyperfine -L seq ./seq-main,target/release/seq "{seq} 1000000"
Benchmark 1: ./seq-main 1000000
  Time (mean ± σ):     294.2 ms ±   5.5 ms    [User: 231.2 ms, System: 61.0 ms]
  Range (min … max):   288.9 ms … 305.3 ms    10 runs
 
Benchmark 2: target/release/seq 1000000
  Time (mean ± σ):     235.1 ms ±   0.8 ms    [User: 180.8 ms, System: 53.8 ms]
  Range (min … max):   233.8 ms … 236.4 ms    12 runs
 
Summary
  target/release/seq 1000000 ran
    1.25 ± 0.02 times faster than ./seq-main 1000000
$ taskset -c 0 hyperfine -L seq ./seq-main,target/release/seq "{seq} 0 0.000001 1"
Benchmark 1: ./seq-main 0 0.000001 1
  Time (mean ± σ):     294.5 ms ±   0.5 ms    [User: 231.3 ms, System: 62.5 ms]
  Range (min … max):   293.8 ms … 295.8 ms    10 runs
 
Benchmark 2: target/release/seq 0 0.000001 1
  Time (mean ± σ):     310.0 ms ±   5.0 ms    [User: 248.8 ms, System: 59.2 ms]
  Range (min … max):   307.4 ms … 324.2 ms    10 runs
  
Summary
  ./seq-main 0 0.000001 1 ran
    1.05 ± 0.02 times faster than target/release/seq 0 0.000001 1

drinkcat avatar Mar 18 '25 17:03 drinkcat

GNU testsuite comparison:

Skip an intermittent issue tests/timeout/timeout (fails in this run but passes in the 'main' branch)

github-actions[bot] avatar Mar 18 '25 17:03 github-actions[bot]

(I don't think those 2 failures are caused by this PR)

drinkcat avatar Mar 18 '25 18:03 drinkcat

GNU testsuite comparison:

GNU test failed: timeout/timeout.log. timeout/timeout.log is passing on 'main'. Maybe you have to rebase?
Congrats! The gnu test misc/stdbuf.log is no longer failing!

github-actions[bot] avatar Mar 19 '25 11:03 github-actions[bot]

Just noticed that this PR regressed uppercase hexadecimal float printing, so I added a commit here to fix that: cd30627b04361732bb72e74e3e9506b9e0e61b1a , and add a test so that it doesn't happen again. Rebased, too.

I also opened #7514 to fix one more small issue, and add a bunch more tests (didn't want to push too much new stuff here).

drinkcat avatar Mar 20 '25 19:03 drinkcat

GNU testsuite comparison:

Congrats! The gnu test timeout/timeout.log is no longer failing!

github-actions[bot] avatar Mar 20 '25 20:03 github-actions[bot]

GNU testsuite comparison:

Skip an intermittent issue tests/timeout/timeout (fails in this run but passes in the 'main' branch)

github-actions[bot] avatar Mar 22 '25 20:03 github-actions[bot]