simdjson
get_float() to retrieve a 4-byte float is missing.
Hi folks!
We need to retrieve a lot of float values from JSON files, but the only get() helper we found in your library is get_double(), which returns an 8-byte double. This is a serious performance issue due to the conversion we need to do to get a float value.
Can you help us with this? Could you tell us whether the library has a function that returns a 4-byte float?
Thanks.
"This is a serious performance issue due to the conversion we need to do to get a float value."
Parsing a string into a float requires hundreds of instructions. Conversion from double to float is a single instruction on most systems. The cost is comparable to an additional multiplication. So it is unlikely to make a measurable difference.
There are other good reasons to add a get_float() function, however. It is a valid issue.
Thanks.
Furthermore, a get_float() function should give an error when the value is too large (e.g., 1e300).
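To make the range check concrete, here is a minimal sketch of a checked narrowing step applied to a value that was already parsed as a double. The helper name narrow_to_float is hypothetical and is not part of simdjson; it only illustrates the kind of error a get_float() could report. It assumes C++17 for std::optional.

```cpp
#include <cmath>
#include <limits>
#include <optional>

// Hypothetical helper, not part of simdjson: narrow a double to a float,
// returning std::nullopt when a finite value is too large for a float.
inline std::optional<float> narrow_to_float(double value) {
    // Infinities and NaN have float counterparts, so pass them through.
    // (Standard JSON numbers are always finite; this is for completeness.)
    if (!std::isfinite(value)) {
        return static_cast<float>(value);
    }
    // Conservatively reject any finite value whose magnitude exceeds the
    // largest finite float (about 3.4e38), e.g. 1e300.
    if (std::fabs(value) > std::numeric_limits<float>::max()) {
        return std::nullopt;
    }
    // In range: the cast rounds to a nearby representable float.
    return static_cast<float>(value);
}
```

With this sketch, narrow_to_float(1.5) holds 1.5f, while narrow_to_float(1e300) is empty and the caller can treat that as an error.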
Thanks a lot for your quick reply! I'm not sure I understand what you mean.
When you say that parsing a string into a float requires hundreds of instructions, are you referring to some internal procedure that lets you convert to double more efficiently than to float? For us, this layer should be abstracted away.
In terms of performance: when we read a heavily loaded JSON file full of float values, we need to static_cast every value from double to float.
If you are considering adding get_float(), could you give us an idea of when the feature might be ready? It would be very helpful for us to have as detailed a schedule as possible. :)
Again, thanks a lot for your support.
You should be able to convert doubles into floats at tens of gigabytes per second. It is essentially free compared to anything else you might be doing when ingesting JSON files.
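As an illustration of the conversion being discussed, here is a minimal sketch that narrows a batch of doubles to floats. The helper name to_floats is hypothetical, and the sketch assumes the values were already obtained (for example via get_double()) and are known to fit in a float. Each element needs only one cast, which compilers typically lower to a single conversion instruction and can often vectorize.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of simdjson: narrow a batch of doubles
// to floats. Assumes every input value is within the range of float.
inline std::vector<float> to_floats(const std::vector<double>& values) {
    std::vector<float> out(values.size());
    for (std::size_t i = 0; i < values.size(); ++i) {
        // One narrowing conversion per element.
        out[i] = static_cast<float>(values[i]);
    }
    return out;
}
```

If you want to check the cost on your own hardware, you can time this loop over a large vector and compare it with the time spent parsing the same values from JSON.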
We have no timeline at the moment, but if you'd like to sponsor this feature with funding, we could do it faster.
In the following blog post, I make the point that the conversion from double to float is unlikely to be a performance bottleneck on current commodity processors. The conversion is a single instruction that can be retired once per cycle (on most systems):
https://lemire.me/blog/2022/07/20/how-quickly-can-you-convert-floats-to-doubles-and-back/