fast_float
optimized (smaller) lookup table for float (binary32 only)
Have you considered optimizing the code size for parsing floats?
The LUT power_of_five_128 has approximately 1400 entries, which are needed for parsing doubles.
I don't know how many entries are required for parsing a float, but I suspect the LUT could be a lot smaller in that case.
If there was a separate LUT for parsing floats, the compiled binary size could be reduced significantly.
Pull requests invited!
To be clear here, if I understand correctly, @jrahlf wants an implementation that supports only binary32 numbers (float). Squeezing the table is easy: one can simply follow the paper at https://arxiv.org/abs/2101.11408
Of course, the net result will only support binary32 numbers.
My mistake then, and I just confirmed that this would have off-by-1 values, which would mess up the logic.
https://github.com/fastfloat/fast_float/blob/8c4405e76e8bdac4246eb9973e75bdc7962c8dd5/include/fast_float/fast_table.h#L34
If you change these to float, the table size shrinks from 1302 to 208, i.e. you can save approximately 8kB.
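The size arithmetic can be checked with a small sketch. The decimal exponent ranges below are assumptions, chosen so that the entry counts match the 1302 and 208 quoted here, and the constant names are hypothetical rather than taken from the library. The sketch also assumes each power of five occupies two 64-bit words.

```cpp
#include <cstddef>
#include <cstdint>

// Assumed exponent ranges (not read from the library source); they are
// picked so that the resulting counts match the numbers in this thread.
constexpr int double_smallest_q = -342;
constexpr int double_largest_q = 308;
constexpr int float_smallest_q = -65;
constexpr int float_largest_q = 38;

// Two 64-bit words per power of five, one power per exponent in the range.
constexpr std::size_t table_entries(int smallest_q, int largest_q) {
  return 2 * static_cast<std::size_t>(largest_q - smallest_q + 1);
}

constexpr std::size_t double_entries =
    table_entries(double_smallest_q, double_largest_q);
constexpr std::size_t float_entries =
    table_entries(float_smallest_q, float_largest_q);

// Bytes saved when only the float-sized table is compiled in.
constexpr std::size_t bytes_saved =
    (double_entries - float_entries) * sizeof(std::uint64_t);

static_assert(double_entries == 1302, "matches the double table size");
static_assert(float_entries == 208, "matches the float table size");
static_assert(bytes_saved == 8752, "roughly 8.5 kB");
```

Under these assumptions the saving is 8752 bytes, which is consistent with the "approximately 8kB" estimate.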
So one could add another table power_of_five_128 for float and then let the templatized code use the correct table.
There is one catch: If you used both double and float, the code size would be greater (worse) than when only providing the double table. Two possible solutions:
a) compile time option when the user only wants to parse floats, then from_chars<double> is disabled
b) clever data packing so that only the float part of the table gets compiled into the binary, if only from_chars<float> is used. I am assuming here that the float table is a sub range of the double table. Is this correct, @lemire ?
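The idea of letting the templated code pick the table by type could look roughly like the sketch below. It is a toy illustration, not the library's code: toy_pow5_table and lookup_pow5 are hypothetical names, and the tables hold only the first few powers of five instead of the real 128-bit entries.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical per-type table. Only the specializations are defined, so
// using an unsupported type fails at compile time.
template <typename T> struct toy_pow5_table;

template <> struct toy_pow5_table<float> {
  static constexpr std::size_t size = 4;
  static const std::uint64_t *data() {
    // Toy contents: 5^0 .. 5^3.
    static const std::uint64_t values[size] = {1, 5, 25, 125};
    return values;
  }
};

template <> struct toy_pow5_table<double> {
  static constexpr std::size_t size = 8;
  static const std::uint64_t *data() {
    // Toy contents: 5^0 .. 5^7.
    static const std::uint64_t values[size] = {1,   5,    25,    125,
                                               625, 3125, 15625, 78125};
    return values;
  }
};

// Generic code names the table only through T. A program that instantiates
// just lookup_pow5<float> never references the double table.
template <typename T> std::uint64_t lookup_pow5(std::size_t i) {
  assert(i < toy_pow5_table<T>::size);
  return toy_pow5_table<T>::data()[i];
}
```

For example, lookup_pow5<float>(3) reads from the small table and lookup_pow5<double>(7) from the large one. Whether an unused table is actually dropped from the binary depends on the compiler and linker, so measuring the file size remains necessary.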
> I am assuming here that the float table is a sub range of the double table.
Yes, it is.
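If the float table is a subrange of the double table, the float entries start at a fixed offset inside it. The sketch below computes that offset. It assumes a table indexed by decimal exponent with two 64-bit words per power of five, and uses assumed exponent bounds (-342 for double, -65 to 38 for float); the function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed smallest decimal exponents covered by each table.
constexpr int double_smallest_q = -342;
constexpr int float_smallest_q = -65;
constexpr int float_largest_q = 38;

// Index of the first 64-bit word of 5^q in a table that starts at smallest_q
// and stores two words per power.
inline std::size_t word_index(int q, int smallest_q) {
  assert(q >= smallest_q);
  return 2 * static_cast<std::size_t>(q - smallest_q);
}

// Position inside the double table where the float entries would begin.
inline std::size_t float_offset_in_double_table() {
  return word_index(float_smallest_q, double_smallest_q);
}
```

With these bounds, every exponent q in the float range satisfies word_index(q, double_smallest_q) == float_offset_in_double_table() + word_index(q, float_smallest_q), which is what makes a shared table possible in principle.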
So I got a proof of concept: #103
I added the files: example_test_float.cpp and example_test_mixed.cpp.
With the HEAD version, the file sizes are as follows (Ubuntu gcc9.3):
34072 Sep 12 14:43 tests/example_test
34072 Sep 12 14:43 tests/example_test_float
42656 Sep 12 14:43 tests/example_test_mixed <-- not ideal
With the separate float LUT the sizes are:
34072 Sep 12 14:47 tests/example_test
25880 Sep 12 14:47 tests/example_test_float <-- saves 8kB as expected
42744 Sep 12 14:47 tests/example_test_mixed
There are two notable things:
- The extra float LUT only increases the mixed file size by about 100 bytes, which is unexpected (in a good way). I expected an increase of 208 * 8 bytes = 1.6kB.
- The mixed file is 8kB larger than the double file. Heavy inlining might not be ideal for the mixed case (regarding code size). E.g. readelf shows that fast_float::parse_long_mantissa has a code size of 4kB and is instantiated for both float and double.
I would prefer to make the double LUT a composite of the float LUT and additional data, but reading a composite object as one linear array would violate C++ aliasing rules. :(
However, this might be solvable with std::bit_cast ...
Overall, it might make sense to always use either double or float and not mix the types when parsing numbers.