dragonbox
dragonbox copied to clipboard
Better support for ARM and other architectures
Thanks to Tim Kientzle (@tbkka), it turned out that the current code is not working correctly with ARM processors. I believe this is due to errors in fallback implementations of intrinsics. Also the current #define
switches are not able to correctly detect availabilities of intrinsics for ARM processors, resulting in suboptimal code generation.
Just fixed an error that I found. @tbkka, Could you test it with your test suite?
That addresses the cases where too few digits were being generated. I'm still seeing unexpected trailing zeros, though.
Could you give me some specific examples so that I can test on my machine?
Here are a few binary32 values that give me unexpected trailing zeros on ARM32 but give the expected result on x86_64:
Actual: 2.0E-43 Expected: 2E-43 Input: 0x1.1ep-142
Actual: 1.0097420E-28 Expected: 1.009742E-28 Input: 0x1p-93
Actual: 6.710890E7 Expected: 6.71089E7 Input: 0x1.000008p+26
Okay, I think I fixed it now.
By the way, do you know of a way to more correctly deciding availability of intrinsics like __builtin_ctz
that also works for ARM?
Currently I'm just doing something like
#if (defined(__GNUC__) || defined(__clang__)) && defined(__x86_64__)
...
#endif
or
#if (defined(__GNUC__) || defined(__clang__)) && defined(__SIZEOF_INT128__) && defined(__x86_64__)
...
#endif
@expnkx Did make that change in https://github.com/jk-jeon/dragonbox/commit/4032d671c375ac2475bc9a41b042b72d2a646c60, thanks.
I still think we can further improve the coverage by taking care of ARM32, so I'll leave this open.
Would 8-bit / AVR compatibility make any sense for this algorithm?
@malachib Admittedly that's something that has never been in my consideration so far. But, I'd say, why not?
First of all, the code as-is is supposed to be portable as long as things like std::uint64_t
are supported by the toolchain, either with direct hardware support or software emulation. Do toolchains you have in mind support fixed-sized int's up to 64-bit? How about std::uint_least64_t
and friends?
However, the code currently just assumes that float
is IEEE-754 binary32 and double
is IEEE-754 binary64. If that's not the case and if you try to use the mismatched one, the compilation will fail. (Actually, there is a possibility that it compiles and silently does nonsensical things in runtime, but I think that's unlikely.)
I did a bit of googling and it seems that floating-point support in 8-bit AVR toolchains is sort of irregular, is that right? Like, some toolchains use binary32 for both float
and double
, and some others use some variations of IEEE-754 instead, etc..
I've been keeping in mind that the library eventually needs to support some exotic float
/double
and even other user-provided float-like types, but I first need to know what exactly are the target formats that need to be supported. Can you give me some information about this?
Also, will the resulting code really be appropriate for specific platforms in your mind regarding performance/binary size is another question that I have no idea how to answer. But I'd want to see how it goes.
Thanks for the interest!
@jk-jeon a very thoughtful response, thank you!
I typically in fact use ESP32 Espressif chips, but for auxiliary testing I use Arduino AVR which last I checked used a very old GCC (last I checked it was 4.x or 5.x, but apparently they are up to 7.x https://github.com/arduino/toolchain-avr ). I doubt the format has changed between those releases - I always presumed they used IEEE-754, Presumed being the key word.
I did notice you had some specialization possibilities for other formats, kudos to that architecture. AVRs definitely come with serious size constraints, so even a 512-byte ROM lookup buffer isn't a slam dunk for those little guys. But might be worth it.
I wanted to ask the question because the algorithm perhaps isn't even viable without native 32-bit/64-bit integer support at the MCU level. Basically asking more questions than giving answers... lol.
@malachib So I checked the AVR LibC user manual, and it seems:
- All the 64-bit integer types are supported, unless the user specifies the compiler option
-mint8
which turnsint
into 8-bit, which violates the C/C++ standards and seems to be more or less not recommended. - Both
float
anddouble
are IEEE-754 binary32. (The manual only says they are both 32-bits and doesn't explicitly says that they conform to IEEE-754 binary32, but it seems this is kind of implied here and there.)
Given these, I think probably the code as-is may just work there, both for float
and double
, given the caveats:
- I had no idea about this Harvard vs von Neumann shenanigans until now. Ideally static data probably needs to be on the flash memory along with the program source code so we need to decorate their definitions with some macros. Is that a correct understanding?
- As you said the size of the lookup table currently is probably too big. It's 624 bytes for binary32, but I think we can compress it down further using a similar mechanism I do for compressing the table for binary64. I can't know what will be the resulting size, but I guess it would be around 100 bytes. For binary64, I think we could probably compress it down further (it's currently 586 bytes or something like that when compressed) , but I'm not sure if that's really worth doing in general, given that this specific AVR toolchain doesn't seem to care about binary64 at all, and it kinda sounds ridiculous to even try to do something with binary64 on a less powerful hardware where 600~ish bytes data are already too much. But I have no hard opinion.
- I have no idea how large the stack usage is. It also will be a bit larger if table compression is used. I probably need to think about ways to reduce the stack usage, and a mechanism to request that through policy parameters.
- Most importantly, I have no hardware to test some of the things mentioned above. I definitely need some help in this regard.
EDIT: Ah, wait. No, there is a big "if" here: I need several standard C++ headers... I obviously don't use some "advanced" things like <thread>
or <iostream>
but still...
So there seems to be several projects that tries to port freestanding C++ headers into 8-bit AVR chips, but I'm not sure if you're okay with relying on such a thing.
@jk-jeon right on. I have lots of thoughts on this
All the 64-bit integer types are supported, unless the user specifies the compiler option -mint8 which turns int into 8-bit, which violates the C/C++ standards and seems to be more or less not recommended.
Anyone who turns on that option and expects standards-conforming libraries to function OK are summarily dismissed
Both float and double are IEEE-754 binary32. (The manual only says they are both 32-bits and doesn't explicitly says that they conform to IEEE-754 binary32, but it seems this is kind of implied here and there.)
Not surprising. Long double, too https://www.avrfreaks.net/s/topic/a5C3l000000UbfZEAS/t158593
I had no idea about this Harvard vs von Neumann shenanigans until now. Ideally static data probably needs to be on the flash memory along with the program source code so we need to decorate their definitions with some macros. Is that a correct understanding?
Yes. PROGMEM
is likely our tool of choice here - really only needed for the const variables. Functions are automatically put into flash/rom usually
As you said the size of the lookup table currently is probably too big. I do for compressing the table [...] would be around 100 bytes. [...] ridiculous to even try to do something with binary64 on a less powerful hardware where 600~ish bytes data are already too much.
100 bytes would be great. Definitely let's not shoot for binary64. I mean, we (you?) would have to spin up a whole custom float64 type, right? If anyone truly needs it, I motion they have to ask.
Your stack stuff I can only rely on your good judgement. AVRs definitely don't have much RAM. But, so many things are byte-aligned and IIRC pointers are 16 bit. So that's gotta help.
Most importantly, I have no hardware to test some of the things mentioned above. I definitely need some help in this regard. So there seems to be several projects that tries to port freestanding C++ headers into 8-bit AVR chips, but I'm not sure if you're okay with relying on such a thing.
I got you covered in this area.
I am the author of one of those projects https://github.com/malachi-iot/estdlib and even started some preliminary work in anticipation of porting your code in.
I use AVRs and ESP32s for my J1939/CAN projects (my own float->char converter is truly gruesome). I'm eager to chuck in your code to test it. Though I can tell you I will definitely need that 100 byte optimization ... fitting J1939 on an AVR with 32k of ROM is no easy feat!
Perhaps we should start a github discussion?
Anyone who turns on that option and expects standards-conforming libraries to function OK are summarily dismissed
I agree.
Not surprising. Long double, too https://www.avrfreaks.net/s/topic/a5C3l000000UbfZEAS/t158593
From the link:
HOWEVER if you like to run at the bleeding edge of technology then note that in the last month GCC v10 was released and one of the changes in that is that avr-gcc has now got options for switching double to be sizeof()=8.
Hmm.... okay but if I really start working on this I will consider binary64 much later.
Yes. PROGMEM is likely our tool of choice here - really only needed for the const variables. Functions are automatically put into flash/rom usually
Yeah, I think I need some mechanism to detect the toolchain and define an appropriate macro accordingly.
100 bytes would be great.
It requires some work. I think I will probably work on it in the near future but please don't expect it to be done too soon.
I mean, we (you?) would have to spin up a whole custom float64 type, right? If anyone truly needs it, I motion they have to ask.
So according to the link you provided, double
can be binary64, so that's not strictly the case. Also, as long as it is IEEE-754 binary64-encoded, the exact spell of the type being double
or not is not a very important detail. But anyway it seems reasonable to first not care too much about binary64.
I got you covered in this area.
Neat. But if it's not supposed to be a conforming (freestanding or not) standard library implementation, I need to provide some way to configure the library to replace those std functions I'm currently relying on. But I don't want to uglify the code by replacing std::uint32_t
by JKJ_DRAGONBOX_UINT32_T
and alike.
An idea that immediately pops up in my mind is to detect a user-provided macro JKJ_DRAGONBOX_STD_REPLACEMENT_NAMESPACE
, and if that is defined then don't include any std headers. (The user is supposed to include all necessary headers directly then.) And then have something like
namespace stdr = JKJ_DRAGONBOX_STD_REPLACEMENT_NAMESPACE;
inside the jkj::dragonbox::detail
namespace and replace all std::
with stdr::
.
Downsides are (1) all needed stuffs needs to be in the same namespace, and they should have the exact same (unqualified) names as their std counterparts, (2) including necessary headers might be annoying, and it now turns the list of std things this library internally uses into a part of the library's interface, (3) not sure what will happen to the "blessed functions" like std::memcpy
or std::bit_cast
, maybe like losing constexpr
-ness or introduce UB?
Any ideas?
Perhaps we should start a github discussion?
Please feel free to make one!
By the way, just to make sure, you're aware that Dragonbox cannot be used as a compatible replacement of printf("%f")
(as the former does shortest-roundtripping formatting and the latter does fixed-precision one), right? If you ever want to have a proper replacement of printf
then that's a whole another can of worms...
#57 Discussion started!