fast.math: Is inline assembler really faster than compiler builtin?
GDC feeds all overflow intrinsics to the __builtin_xxx functions.
https://github.com/D-Programming-GDC/GDC/blob/16a15523c01e996660a0ad8b8043dd97d36535ee/gcc/d/d-codegen.cc#L3455-L3463
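For readers unfamiliar with the builtins in question, here is a minimal C sketch of what an overflow-checked add looks like when it is routed through the GCC/Clang builtin `__builtin_add_overflow`. The helper name `adds_checked` is hypothetical and is not taken from GDC or fast.math; the sticky overflow flag is modelled on D's `core.checkedint`.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper (not from GDC or fast.math): adds two ints and
   records overflow in *overflow via the GCC/Clang builtin. */
static int adds_checked(int a, int b, bool *overflow)
{
    int result;

    /* The builtin returns true when the exact sum does not fit in `result`;
       `result` then holds the wrapped value. */
    if (__builtin_add_overflow(a, b, &result))
        *overflow = true; /* sticky: only ever set, never cleared here */

    return result;
}
```

Calling `adds_checked(INT_MAX, 1, &flag)` sets `flag` and returns the wrapped value, while an in-range sum leaves `flag` untouched. Whether the code the compiler emits for this beats a hand-written `add`/`jo` sequence is exactly what would need to be benchmarked per target.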
When I benchmarked it, the ASM came out on top, at least for 32-bit. I only benchmarked on my Haswell CPU, and the difference is not huge. It is entirely possible that other models, and in particular older 32-bit CPUs, will behave differently. I have a Pentium 3 notebook lying around here, and if I ever install a recent GDC on it I will run another comparison.

Issue #9 is affecting performance much more. Cross-module inlining only works in all-at-once compiles, so generic functions like skipping over (and testing for) ASCII white-space need to reside in the same module as the higher-level functionality that uses them (JSON and other text parsers with insignificant white-space).
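As a rough illustration of the layout constraint, here is a C sketch in which the white-space helper sits in the same translation unit as its caller, so the compiler can inline it without any cross-module support. This is only an analogy to D modules under GDC, and both function names are hypothetical.

```c
/* Hypothetical helper: advance past ASCII white-space
   (space, tab, line feed, carriage return). */
static inline const char *skip_ascii_ws(const char *p)
{
    while (*p == ' ' || *p == '\t' || *p == '\n' || *p == '\r')
        ++p;
    return p;
}

/* Hypothetical caller standing in for one step of a JSON parser:
   returns the first significant character of `text`. Because the helper
   is defined in this same file, the call is a candidate for inlining. */
static inline char next_significant_char(const char *text)
{
    return *skip_ascii_ws(text);
}
```

If the helper lived in a separately compiled unit instead, the call in the parser's hot loop would normally stay a real function call unless something like whole-program or link-time optimization were used.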