serenity AK+LibJS: Exact and fast floating point to string conversion

Currently, the floating point to string conversion is implemented several times across the codebase. Furthermore, there are some test failures in LibJS because of multiple imprecise algorithms implemented there. So, this PR aims to provide a single, fast and precise algorithm of doing the aforementioned conversion.

The algorithm is Ryu from https://dl.acm.org/doi/pdf/10.1145/3192366.3192369, which does not require any arbitrary precision arithmetic.

This is WIP but, as it is my first contribution, I want some comments on the already added code.

Oct 25 '22 18:10 DanShaders

Also in case you're interested there are even newer algorithms like dragonbox: https://github.com/jk-jeon/dragonbox which is used by libfmt

Oct 25 '22 20:10 davidot

Well, I hadn't seen it when I was doing the initial research. Anyway, I believe that what is currently implemented would already be a significant improvement over a for loop with fmod. So, I want this PR to stay with the Ryu and then potentially swap it for something better.

Oct 25 '22 20:10 DanShaders

Well, I hadn't seen it when I was doing the initial research. Anyway, I believe that what is currently implemented would already be a significant improvement over a for loop with fmod. So, I want this PR to stay with the Ryu and then potentially swap it for something better.

Yes we can just start with Ryu, don't think we need the performance difference of dragonbox or the others

Oct 25 '22 20:10 davidot

Firstly, I've stress tested double conversion (checked 10^11 random bit patterns of double and all doubles with mantissa < 2^20 or mantissa > 2^52 - 2^20) and brute forced all possible floats against dragonbox. Secondly, I've run some benchmarks: our implementation of Ryu converts double in 30ns and float in 35ns on average on my machine (i7-12700h) (versus constant 10ns from Dragonbox).

All of the benchmarks are here. I do not really want to leave this code in the separate repository but do not know where to place it here (and it also depends on the external library).

I guess the PR is already too big for first, so will leave the benchmarking code intact as well as the other double -> string conversions in AK and LibJS.

Oct 27 '22 05:10 DanShaders

@DanShaders The minmax Euclid algorithm appearing in the Adams' paper is not entirely correct. For example, with the Python code you embedded in the cpp file, minmax_euclid(3,8,5) prints (1,2) which means the minimum is 1 and the maximum is 6. But 3 * 5 mod 8 = 7 so actually 7 is also possible. I actually do not believe the proof written in the paper. When I was looking at it, after several trials I just gave up comprehending it as it just looked very suspicious to me.

Also according to what I remember the algorithm as written is ridiculously slow. No idea how long did it actually take for giving you the table though. I guess maybe Adams himself is also aware of these issues and maybe the one that is currently in the reference implementation repo is probably the one with these issues addressed.

Even if the verification algorithm is wrong, the actual lookup table you got is probably okay. But who can be perfectly sure about that. I guess maybe you need to perform either the complete exhaustive test (it sounds like you already did for float, and it's practically impossible for double), or compare them one-by-one with the one that is in the reference repo (because Ryu reference implementation is dauntingly heavily tested), or re-do the precision sufficiency check with a correct algorithm.

In this paper, page 22, you can find an alternative algorithm which gives the exact min/max, which I believe to be actually correct. (I guess likely the one Adams is currently using is something similar to this.) Also this runs much faster than the one given in the Adams' paper. With my C++ implementation it runs instantaneously, though it can be somewhat slower in Python. I wrote a very long-winded mathematical proof of this algorithm, and quite confident about the algorithm's correctness, although nobody proofread the proof I wrote. Actually I do not use this algorithm anymore because it has been superseded by a more generic routine of computing so-called the best rational approximations. I would say this is indeed a strictly better approach, but if you want a quick drop-in replacement for the current implementation of minmax_euclid, then the one written in the paper mentioned above might be a good fit.

Oct 28 '22 08:10 jk-jeon

@linusg Done @jk-jeon As for speed, it wasn't that bad, though I only tried running it for numbers up to 80 bits. With the "more likely to be correct algorithm" the bounds (B0 and B1) are the same, so the tables should indeed be correct. Actually, I'm still not being able to get bounds mentioned in the Ryu paper (mine are off by 1 or 2). This is not a big deal for doubles, but for floats mine produce coefficients larger than UINT64_MAX, so I had to manually overwrite them and check correctness via brute force. In the reference implementation, there are some unclear bit length computations around here where these numbers are coming from. I'm wondering if you've encountered the same problem before.

Oct 28 '22 16:10 DanShaders

@DanShaders First of all, I have never implemented Ryu by myself, although I did implement several different algorithms doing the same thing and utilized the (corrected version of) minmax Euclid for that.

The exact table entries I used are, if I recall correctly, somewhat different from Ryu. I aligned all of the first 1's to the MSB of the table entries, not sure if Ryu also does that. If not, that's probably the most prominent difference. (Plus the one in the reference repo has more entries than necessary, because it also contains the ones needed for the parsing.)

With the "more likely to be correct algorithm" the bounds (B0 and B1) are the same, so the tables should indeed be correct.

Did you already check that? So fast😁

This is not a big deal for doubles, but for floats mine produce coefficients larger than UINT64_MAX

From which exponent did you get that?

In the reference implementation, there are some unclear bit length computations around here where these numbers are coming from.

Isn't this just Lemma 3.4 in the Ryu paper? Could you elaborate why do you think it's unclear?

Oct 28 '22 17:10 jk-jeon

@jk-jeon Yes, I'm definitely not aligning table entries.

From which exponent did you get that?

Oh, technically I have a bug in my script (I forgot about the counterintuitive rounding towards zero of // for negative numbers in Python). Nonetheless, the paper also seems to have a problem: I assume that formulas in subsection 3.4 step 3' case $e_2 < 0$ should be the following:

$$ \begin{align} q &= \max\left(0, \lfloor -e_2 \log_{10} 5 \rfloor - 1 \right) \ k &= \lceil \log_2 5^{-e_2 - q} \rceil - B_1 \ \end{align} $$

For $e_2 = -140$ and $B_1 = 63$, we get $q = 96$ and $k = 40$. However, using lemma 3.4 we get

$$ k \le \log_2 \left( \frac{\min((w \cdot 5^{-e2-q})\mod 2^q)}{\max(w)} \right) < 39.926 $$

assuming $\max(w) = 67'108'862$ and substituting $w = 45'038'809$ for minimum. So, that is why 63 is a supposedly incorrect bound.

However, if I round everything correctly, using 64 instead of 63 is not a problem. ~~As everything works regardless, I'm not really certain but I might fix the rounding in my code.~~ Fixed

Could you elaborate why do you think it's unclear?

I was not able to understand how those bit length manipulations resulted in the correct rounding, and, as we can see, they didn't.

Actually, if someone could check the computations, I might also open an issue in ryu repo.

Oct 28 '22 23:10 DanShaders

@DanShaders

Basically I just skimmed through the proof, copied the code from the article and checked that $B_{0}$ and $B_{1}$ are the same.

Actually I'm quite doubtful about that. Are you sure you get more than 64-bits for IEEE-754 binary32?

Oh, technically I have a bug in my script (I forgot about the counterintuitive rounding towards zero of // for negative numbers in Python)

You mean so your $B_{0}$ and $B_{1}$ are actually not more that 64 for IEEE-754 binary32? Also so sad that C++ also does the same thing for integer division... Apparently it is because that's how most of the actual hardware work. (Here is the defect report which mandated this behavior.)

Nonetheless, the paper also seems to have a problem:

As far as I can tell there is nothing wrong in the statement/proof of Lemma 3.4. I guess you mean that $q$ is not equal to what it is supposed to be, which kind of makes sense b/c it doesn't match the $q$ given in Lemma 3.2. Probably the author mistakenly thought that $\log_{10}5$ is equal to $\log_{5}2$. I don't know if the specific $q$ you proposed is the right one though, as it's still off by 1 from what's given in Lemma 3.2. The proof of $e_{2}<0$ case of Lemma 3.2 is omitted in the paper, so maybe, the author is again wrong at this point, but I didn't check that. Are you sure $-1$ is the correct one?

However, if I round everything correctly, using 64 instead of 63 is not a problem.

As far as everything is inside $64$-bits there should be no problem yeah.

Oct 29 '22 01:10 jk-jeon

@jk-jeon

Actually I'm quite doubtful about that. Are you sure you get more than 64-bits for IEEE-754 binary32?

No, I'm not anymore.

You mean so your $B_0$ and $B_1$ are actually not more that 64 for IEEE-754 binary32?

Exactly.

Also so sad that C++ also does the same thing for integer division...

Off topic: I have CP background where we usually haven't dealt with negative numbers in the context of floor/ceil (except for some problems on number theory), so I'm very used to floor(a/b) = a // b)

As far as I can tell there is nothing wrong in the statement/proof of Lemma 3.4. I guess you mean that q is not equal to what it is supposed to be, which kind of makes sense b/c it doesn't match the q given in Lemma 3.2. Probably the author mistakenly thought that log10⁡5 is equal to log5⁡2. I don't know if the specific q you proposed is the right one though, as it's still off by 1 from what's given in Lemma 3.2. The proof of e2<0 case of Lemma 3.2 is omitted in the paper, so maybe, the author is again wrong at this point, but I didn't check that.

I don't see how the proof depends on the exact value of q, so it should work at least for $q \in [0; \lfloor -e_2 \log_{10} 5\rfloor]$. The contradiction is that $k$, given by $k = \lceil \log_2 5^{-e_2 - q} \rceil - B_1$ with $B_1=63$, should be allowed ("and then use B1 to determine a legal value for k as") but it is not as $40 > 39.926$. Of course, I'm not entirely sure about this exact formula for $q$ but it is what I use (and successfully stress against dragonbox) and it follows the logic of "the largest which does not break loop invariant - 1".

Just to clarify: I define $B_1$ as the minimal number such that $\forall -e2 - q : \lceil \log_25 * (-e2 - q) \rceil - B_1 <= k_\text{max}$ and then use "the bit length of the coefficients is no more than $B_1$" as the corollary.

Oct 29 '22 02:10 DanShaders

@DanShaders There is no contradiction if you choose $q = \lfloor-e_{2}\log_{10}5\rfloor$. Then $e_{2}=-140$, $q=97$, so according to your definition, $k$ becomes $37$. I haven't checked what is the bound Lemma 3.4 gives in this case, but probably it's more than $37$.

Oct 29 '22 02:10 jk-jeon

@jk-jeon The value is somewhere between 39 and 40. As far as I can tell, they are using two different formulas for q: one with -1 (https://github.com/ulfjack/ryu/blob/75d5a85440ed356ad7b23e9e6002d71f62a6255c/ryu/d2s.c#L158) and one without (https://github.com/ulfjack/ryu/blob/75d5a85440ed356ad7b23e9e6002d71f62a6255c/ryu/f2s.c#L112)...

Oct 29 '22 02:10 DanShaders

@jk-jeon Hmm, earlier d2s also used formula without -1 if NICER_OUTPUT was not defined. https://github.com/ulfjack/ryu/blob/365dd77ddfca1e38ae88099b53d4c99c21b82a0e/ryu/d2s.c#L525

Oct 29 '22 03:10 DanShaders

The value is somewhere between 39 and 40.

I just ran a computation. The bound is between 39 and 40 when $q = \lfloor -e_{2}\log_{10}5\rfloor - 1$. If you use $q = \lfloor -e_{2}\log_{10}5\rfloor$ instead, then it is between 43 and 44.

Hmm, earlier d2s also used formula without -1 if NICER_OUTPUT was not defined.

My conspiracy theory: the formula without -1 is the correct one, but later they discovered that the one with -1 actually leads to the same output for double and has a bunch of benefits so they just got stuck to it.

Oct 29 '22 03:10 jk-jeon

My conspiracy theory: the formula without -1 is the correct one, but later they discovered that the one with -1 actually leads to the same output for double and has a bunch of benefits so they just got stuck to it.

This seems reasonable.

I guess we are not discussing the PR anymore. Nonetheless, thank you for your time. Just to make sure that we are on the same page: would you say that the code in terms of the algorithms used should be correct?

Oct 29 '22 03:10 DanShaders

@DanShaders As I pointed out in the first comment I actually haven't implemented Ryu by myself, in fact I even have never read the Ryu paper thoroughly except for some parts from Section 3.2. So I cannot have any idea on your actual code on Ryu itself. (I can certainly believe that it is correct as you did a nice stress test against Dragonbox reference impl.) The only thing I was worried about was the correctness of minmax_euclid. I have seen several people who reimplemented Ryu but I am not aware of anyone else who actually re-created the table by directly using minmax_euclid. Since minmax_euclid has some issue and my impression was that Adams is actually using a different routine, I just wanted to make sure that the issue is not really materialized. As it seems like you replaced the embedded Python code into the one using the corrected algorithm, I'm all happy now.

Oct 29 '22 04:10 jk-jeon

Looks like I created a conflict by merging https://github.com/SerenityOS/serenity/pull/15757, sorry about that!

Nov 03 '22 04:11 ADKaster

I'm kinda losing hope for getting this merged. It was supposed to be a quick fix :)

Nov 03 '22 22:11 DanShaders

This merge breaks the build on M1 Mac. macOS Version 12.6

In file included from /Users/joshua/dev/org/serenity/AK/StringFloatingPointConversions.cpp:9:
/Users/joshua/dev/org/serenity/Meta/Lagom/../../AK/FloatingPoint.h:21:1: error: floating constant exceeds range of 'long double' [-Werror=overflow]
   21 | static_assert(__LDBL_MAX__ == 1.189731495357231765e4932L);
      | ^~~~~~~~~~~~~

Nov 04 '22 15:11 Vbitz

Yep see #15920 for the current attempted fix.

Nov 04 '22 16:11 ADKaster

serenity serenity copied to clipboard

AK+LibJS: Exact and fast floating point to string conversion

serenity
serenity copied to clipboard