ryu icon indicating copy to clipboard operation
ryu copied to clipboard

Missing loop in two-digit optimization?

Open cespare opened this issue 6 years ago • 3 comments

Hey @ulfjack, thanks for the great algorithm!

Quick question: should this be a loop rather than an if, or was that intentional?

https://github.com/ulfjack/ryu/blob/b2ae7a712937711375a79f894106a0c6d92b035f/ryu/d2s.c#L414

I haven't tried building your benchmarks yet, but in my own implementation based on your C code changing this to a loop gives a noticeable speedup for formatting numbers like 0.3.

cespare avatar Jan 13 '19 02:01 cespare

This was my optimization and it was intentionally written as a single test instead of a loop. For random doubles, typically the “remove two digits” optimization can be applied exactly once, and testing again would degrade performance. Only numbers with a very short round-trip format, like 0.3, benefit from repeated applications. It’s a judgment call, but I chose to avoid introducing assumptions that the input is likely to be friendly to the optimization. The comments about loop iterations were profiled from random doubles.

This is also why the float codepath lacks the optimization - random floats don’t typically have 2 digits to trim.

StephanTLavavej avatar Jan 13 '19 09:01 StephanTLavavej

@StephanTLavavej Please reconsider using a 2-digit loop with the 1-digit conditional step instead.

This solution has the same number of division operations as the currently implemented a 2-digit conditional step with the 1-digit loop for most fast cases, but it greatly reduces them for numbers with small mantissas.

Currently this weak performance comparing with the best C/C++ dtoa routines for small mantissas prevents wider adoption of the Ryu algorithm, see this issue for example.

Here is how that rounding branch was implemented in Scala for jsoniter-scala.

Also, here are proposed changes with benchmark results for the Rust implementation.

plokhotnyuk avatar May 25 '19 05:05 plokhotnyuk

I can't recall if I tried the approach of looping for 2 and testing for 1. I can profile again. I know that it helps short outputs (many digits to trim), and it looks like it should help the case where there's only 1 digit to trim, but I'm worried that it'll slow down the most common case for random doubles (exactly 2 digits to trim). Perhaps the downside is minimal.

StephanTLavavej avatar May 26 '19 05:05 StephanTLavavej