ryu
Losing the battle to convert a string to double
Going from double to string works great with this lib, but I am having trouble parsing a string into a double both efficiently and correctly. This is what I have so far. Any ideas? Is there some ulfjack magic that could be worked here?
https://github.com/vinniefalco/json/blob/develop/include/boost/json/detail/number.ipp
If you know your input is the "shortest" format of a floating-point value of a given width, then there is in fact a fast algorithm to parse these values. I have some code for that. I'll need to do a bit of work in order to commit it here, but it shouldn't be much (famous last words).
@ulfjack There is an amazing library for Rust that parses float/double values precisely and fast
I've ported the moderate path from it to our library for Scala
I'm aware of the article. I think we can do better.
I have pushed some code to the https://github.com/ulfjack/ryu/tree/parsing branch. There are still big holes in the code, but that's the general shape. Note that it can only handle up to ~18 decimal digits (which is fine for input that was formatted as shortest), and I need to double-check that the tables have enough bits for this usage.
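For context on the digit limit: any decimal number with at most 19 digits fits in a `uint64_t`, so the digits can be accumulated into a single integer with a simple length check instead of overflow handling. Below is a minimal sketch of that first step, splitting the input into an integer significand and a decimal exponent. The names `Decimal` and `parse_decimal` are hypothetical and chosen for illustration; this is not the code from the parsing branch. It also makes a simplifying assumption: leading zeros count toward the digit limit, so inputs with long runs of zeros are rejected rather than handled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical result type: value = (-1)^negative * significand * 10^exponent10.
struct Decimal {
    std::uint64_t significand;
    int exponent10;
    bool negative;
};

// Hypothetical helper. Accepts [-]digits[.digits][(e|E)[+|-]digits].
// Returns false on malformed input or on more than 19 digits in total.
bool parse_decimal(const std::string& s, Decimal* out) {
    std::size_t i = 0;
    Decimal d = {0, 0, false};
    if (i < s.size() && s[i] == '-') { d.negative = true; ++i; }

    int digits = 0;        // digits seen so far (leading zeros included)
    int frac_digits = 0;   // digits seen after the decimal point
    bool seen_point = false;
    for (; i < s.size(); ++i) {
        const char c = s[i];
        if (c == '.') {
            if (seen_point) return false;
            seen_point = true;
        } else if (c >= '0' && c <= '9') {
            if (++digits > 19) return false;  // a 20th digit could overflow
            d.significand = d.significand * 10 + static_cast<unsigned>(c - '0');
            if (seen_point) ++frac_digits;
        } else {
            break;
        }
    }
    if (digits == 0) return false;

    int exp = 0;
    if (i < s.size() && (s[i] == 'e' || s[i] == 'E')) {
        ++i;
        bool exp_negative = false;
        if (i < s.size() && (s[i] == '+' || s[i] == '-')) {
            exp_negative = (s[i] == '-');
            ++i;
        }
        if (i == s.size()) return false;  // exponent marker without digits
        for (; i < s.size(); ++i) {
            if (s[i] < '0' || s[i] > '9') return false;
            if (exp < 100000) exp = exp * 10 + (s[i] - '0');  // saturate huge exponents
        }
        if (exp_negative) exp = -exp;
    }
    if (i != s.size()) return false;  // trailing garbage

    d.exponent10 = exp - frac_digits;
    *out = d;
    return true;
}
```

For example, `"123.45"` yields significand `12345` with exponent `-2`. Turning that pair into the nearest double is the hard part and is not shown here.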
I pushed another commit filling in some of the missing pieces.
I pushed a couple of fixes to the branch. Before it can be merged, I need to figure out error handling, make it work on Mac/Win, and double-check that the bit widths are ok. More tests would also be good, but IMO not a blocker for merging into master.
@ulfjack please consider using SWAR techniques for parsing the decimal representation, like here: https://github.com/sirthias/borer/issues/114
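For readers unfamiliar with SWAR ("SIMD within a register"), here is a sketch of one widely used formulation: eight ASCII digits are loaded into a 64-bit word, validated with a few bitwise operations, and combined with three multiplications instead of eight multiply-add steps. This is not necessarily the exact approach from the linked issue, and the function names are made up for illustration. The load is written byte by byte so the sketch does not depend on host endianness.

```cpp
#include <cassert>
#include <cstdint>

// Load 8 bytes so that p[0] lands in the least significant byte.
inline std::uint64_t load8(const char* p) {
    std::uint64_t v = 0;
    for (int i = 0; i < 8; ++i) {
        v |= static_cast<std::uint64_t>(static_cast<unsigned char>(p[i])) << (8 * i);
    }
    return v;
}

// True if every byte of v is an ASCII digit '0'..'9' (0x30..0x39).
// A digit has high nibble 3, and still has high nibble 3 after adding 6.
inline bool is_eight_digits(std::uint64_t v) {
    return ((v & 0xF0F0F0F0F0F0F0F0ULL) |
            (((v + 0x0606060606060606ULL) & 0xF0F0F0F0F0F0F0F0ULL) >> 4)) ==
           0x3333333333333333ULL;
}

// Convert 8 ASCII digits (already validated) to their integer value.
inline std::uint32_t parse_eight_digits(std::uint64_t v) {
    // Strip the ASCII bias, then merge neighbours: 10 * d[k] + d[k+1] per byte.
    v = ((v & 0x0F0F0F0F0F0F0F0FULL) * 2561ULL) >> 8;
    // Merge pairs of two-digit values into four-digit values per 16-bit lane.
    v = ((v & 0x00FF00FF00FF00FFULL) * 6553601ULL) >> 16;
    // Merge the two four-digit halves into the final eight-digit value.
    v = ((v & 0x0000FFFF0000FFFFULL) * 42949672960001ULL) >> 32;
    return static_cast<std::uint32_t>(v);
}
```

A parser would typically call `is_eight_digits` first and fall back to a byte-at-a-time loop when fewer than eight digits remain or a non-digit is found.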
@vinniefalco posted on the PR: "How about 0.X1E1000 where X is a thousand zeroes"
Looks like you found a bug!
We were able to fix our number parser:
https://github.com/vinniefalco/json/blob/5aae31dc74d055d84a7f13e438d80cdf6005c670/include/boost/json/detail/impl/number.ipp#L155
However, it fails to produce the same result as stod in some cases, where it is off by just one bit. I guess that is what all of the complicated-looking code in s2d.c is for: identifying the cases where the result needs to be rounded up or down by one bit to find the nearest double? And is this all because converting from a base-10 exponent to a base-2 exponent is lossy?
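One way to see where a one-bit error can come from: computing `m * 10^e` in floating point is only guaranteed to give the nearest double when both `m` and `10^e` are exactly representable, so that the single multiplication or division is the only rounding step. For IEEE 754 binary64 that holds when `m <= 2^53` and `|e| <= 22`, since 10^22 is the largest power of ten that a double represents exactly. Outside that range the power of ten is already rounded, and a second rounding on top of it can land on the neighbouring double. The sketch below shows this classic exact fast path. The name `exact_fast_path` is hypothetical, this is not the code from s2d.c or number.ipp, and it assumes double arithmetic is evaluated in binary64 (for example SSE2 rather than x87 extended precision).

```cpp
#include <cassert>
#include <cstdint>

// Powers of ten that are exactly representable as IEEE 754 binary64.
static const double kExactPow10[] = {
    1e0,  1e1,  1e2,  1e3,  1e4,  1e5,  1e6,  1e7,
    1e8,  1e9,  1e10, 1e11, 1e12, 1e13, 1e14, 1e15,
    1e16, 1e17, 1e18, 1e19, 1e20, 1e21, 1e22};

// Hypothetical helper: computes m * 10^e10 only when the result is
// guaranteed to be correctly rounded. Returns false otherwise, in which
// case the caller needs a slower, exact algorithm.
bool exact_fast_path(std::uint64_t m, int e10, double* out) {
    if (m > (static_cast<std::uint64_t>(1) << 53)) return false;  // m not exact
    if (e10 < -22 || e10 > 22) return false;                      // 10^|e10| not exact
    const double d = static_cast<double>(m);                      // exact conversion
    // Both operands are exact, so this is the only rounding step.
    *out = (e10 < 0) ? d / kExactPow10[-e10] : d * kExactPow10[e10];
    return true;
}
```

A parser that skips the range checks and always multiplies by a rounded power of ten will agree with a correctly rounding parser on most inputs but can differ in the last bit on others.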
@vinniefalco ICYMI: https://www.exploringbinary.com/17-digits-gets-you-there-once-youve-found-your-way/
Well... it would have been nice to read that before I struggled to write the code!
@vinniefalco @ulfjack I am currently looking at fast string-to-double conversion, and from reading this discussion it's not clear whether the current ryu implementation is correct, or whether the JSON implementation is good to use. Can either of those solutions be used yet?
The JSON implementation seems to work (i.e. it passes its tests), but it only implements the "fast path" conversion, which means it can be off by one bit for some strings. Note also that it is not officially part of Boost yet and is still being worked on.