Can boost::math::float_distance be sped up?
In the AGM PR, I have found that ~90% of the runtime is spent computing float distances. However, at least for float and double, the following trivial modification drops the runtime to a negligible fraction of the total runtime:
    int32_t fast_float_distance(float x, float y) {
        static_assert(sizeof(float) == sizeof(int32_t), "float is incorrect size.");
        int32_t xi = *reinterpret_cast<int32_t*>(&x);
        int32_t yi = *reinterpret_cast<int32_t*>(&y);
        return yi - xi;
    }
    int64_t fast_float_distance(double x, double y) {
        static_assert(sizeof(double) == sizeof(int64_t), "double is incorrect size.");
        int64_t xi = *reinterpret_cast<int64_t*>(&x);
        int64_t yi = *reinterpret_cast<int64_t*>(&y);
        return yi - xi;
    }
It seems like boost::math::float_distance is considerably more general than this, but can we dive through a happy path to extract performance in the trivial cases?
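For reference, the same trick can be written without the pointer cast. The following is a minimal sketch, not the code from the PR: `float_bits` and `bit_distance` are hypothetical names (not Boost.Math APIs), and it assumes IEEE 754 binary32/binary64 layouts and strictly positive, finite inputs. `std::memcpy` copies the bytes into the integer, which avoids the strict-aliasing concerns of `reinterpret_cast`.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helpers (not part of Boost.Math): copy the bit pattern of a
// floating-point value into a signed integer of the same width.
inline std::int32_t float_bits(float x) {
    static_assert(sizeof(float) == sizeof(std::int32_t), "float is incorrect size.");
    std::int32_t i;
    std::memcpy(&i, &x, sizeof i);
    return i;
}

inline std::int64_t float_bits(double x) {
    static_assert(sizeof(double) == sizeof(std::int64_t), "double is incorrect size.");
    std::int64_t i;
    std::memcpy(&i, &x, sizeof i);
    return i;
}

// Assumption: x and y are both positive and finite. For such values the
// integer bit patterns are ordered the same way as the values themselves,
// so the difference counts representable steps from x to y.
inline std::int32_t bit_distance(float x, float y) {
    assert(x > 0 && y > 0);
    return float_bits(y) - float_bits(x);
}

inline std::int64_t bit_distance(double x, double y) {
    assert(x > 0 && y > 0);
    return float_bits(y) - float_bits(x);
}
```

For example, `bit_distance(1.0f, 2.0f)` is `1 << 23`, since a binary32 value has 23 stored significand bits and every value in [1, 2) shares one exponent.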
That's interesting! Does it pass the tests?
@jzmaddock : Yes; here's some background on the trick.
OK, but as pointed out in the article, your trick fails when the two inputs differ in sign (this includes the case where one input is zero). I also get a negative rather than a positive result for, say, fast_float_distance(-1, -0.5). All of which can be fixed with some special-case handling, of course...
@jzmaddock : Yeah, the fact that agm requires positive numbers (over the reals) simplifies the logic considerably.
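The special-case handling mentioned above can be sketched as follows. This is an illustration only, not what Boost.Math or the AGM PR does: `ordered_key` and `signed_bit_distance` are hypothetical names, and the code assumes IEEE 754 binary32 and non-NaN inputs. The idea is to remap the sign-magnitude bit pattern to an integer that increases monotonically with the value, so negative and mixed-sign inputs behave consistently. The result is widened to `int64_t` because the distance between two floats of opposite sign can exceed the range of `int32_t`.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>
#include <limits>

// Hypothetical helper: map a float to an integer key that is monotonic in
// the float's value. Non-negative floats keep their bit pattern; negative
// floats (sign bit set) are reflected so that more negative values get
// smaller keys. Both +0.0f and -0.0f map to 0.
inline std::int32_t ordered_key(float x) {
    static_assert(sizeof(float) == sizeof(std::int32_t), "float is incorrect size.");
    std::int32_t i;
    std::memcpy(&i, &x, sizeof i);
    if (i < 0) {
        i = std::numeric_limits<std::int32_t>::min() - i;
    }
    return i;
}

// Number of representable steps from x to y: positive when y > x,
// negative when y < x. Assumes neither input is NaN.
inline std::int64_t signed_bit_distance(float x, float y) {
    assert(!std::isnan(x) && !std::isnan(y));
    return static_cast<std::int64_t>(ordered_key(y)) -
           static_cast<std::int64_t>(ordered_key(x));
}
```

With this remapping, `signed_bit_distance(-1.0f, -0.5f)` is positive and `signed_bit_distance(-0.0f, 0.0f)` is zero. A `double` version would need a result type wider than 64 bits (or an overflow check) to cover opposite-sign inputs of large magnitude.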
The wins for float128 are pretty huge:
Without the fast float distance:
AGM<boost::multiprecision::float128>       8411 ns         8377 ns        82937
With it:
AGM<boost::multiprecision::float128>       2241 ns         2230 ns       313072
Implementation:
#ifdef BOOST_HAS_FLOAT128
    __int128_t fast_float_distance(boost::multiprecision::float128 x, boost::multiprecision::float128 y) {
        static_assert(sizeof(boost::multiprecision::float128) == sizeof(__int128_t), "float128 is incorrect size.");
        __int128_t xi = *reinterpret_cast<__int128_t*>(&x);
        __int128_t yi = *reinterpret_cast<__int128_t*>(&y);
        return yi - xi;
    }
#endif
I couldn't get it to work with long double, sadly.
long double is quite irritating sometimes. However it's also useful ;)