
HMM_RSquareRootF has different precision between SSE and non-SSE

Open · bvisness opened this issue 7 years ago · 2 comments

As of this writing, here is what HMM_RSquareRootF looks like:

HMM_INLINE float HMM_RSquareRootF(float Float)
{
    float Result;

#ifdef HANDMADE_MATH__USE_SSE
    __m128 In = _mm_set_ss(Float);
    __m128 Out = _mm_rsqrt_ss(In);
    Result = _mm_cvtss_f32(Out);
#else
    Result = 1.0f/HMM_SquareRootF(Float);
#endif

    return(Result);
}

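This means that SSE builds will use an approximation (Intel documents a maximum relative error of 1.5 * 2^-12 for _mm_rsqrt_ss), but non-SSE builds will compute the result at full float precision.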

What do we want to do to fix this? Making the SSE version more precise or making the non-SSE version faster would both be breaking changes. But, I suspect that someone who's deliberately using an inverse square root would expect it to use an approximation.
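For reference, the "make the SSE version more precise" option is usually done by polishing the hardware estimate with one Newton-Raphson step. The following is only a sketch of that idea, not code from HandmadeMath: the names RefineRSqrtEstimate, RefinedRSquareRootF_SSE, and SKETCH_HAS_SSE are hypothetical, and the SSE detection macro check is an assumption about typical GCC/Clang/MSVC builds.

```c
#include <assert.h>

#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>
#define SKETCH_HAS_SSE 1 /* hypothetical macro, stands in for HANDMADE_MATH__USE_SSE */
#endif

/* One Newton-Raphson step for y = 1/sqrt(x): y' = y * (1.5 - 0.5 * x * y * y).
   Hypothetical helper; assumes x > 0 and that estimate is already close. */
static inline float RefineRSqrtEstimate(float x, float estimate)
{
    assert(x > 0.0f);
    return estimate * (1.5f - 0.5f * x * estimate * estimate);
}

#ifdef SKETCH_HAS_SSE
/* Hypothetical SSE variant: start from the rsqrtss estimate, then refine once. */
static inline float RefinedRSquareRootF_SSE(float x)
{
    float estimate = _mm_cvtss_f32(_mm_rsqrt_ss(_mm_set_ss(x)));
    return RefineRSqrtEstimate(x, estimate);
}
#endif
```

Each Newton-Raphson step roughly squares the relative error, so the refined result is much closer to 1.0f/HMM_SquareRootF(Float), at the cost of a few extra multiplies. It would still not be guaranteed to match the non-SSE path bit for bit.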

bvisness, Nov 30 '18 15:11

I'd go with making the non-SSE rsqrtf faster (use an approximation). Most people are probably using SSE builds anyway, so the approximation is something they're probably used to.
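As a rough illustration of what an approximate non-SSE path could look like, here is a sketch using the well-known integer bit trick followed by one Newton-Raphson step. This is a swapped-in technique, not something proposed in this thread or present in HandmadeMath; the names ApproxRSquareRootF and RSqrtResidual are hypothetical, and the code assumes 32-bit IEEE 754 floats and positive, normal inputs.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar approximation of 1/sqrt(x).
   Assumes float is a 32-bit IEEE 754 value and x is positive and normal. */
static inline float ApproxRSquareRootF(float x)
{
    uint32_t bits;
    float y;

    assert(x > 0.0f);

    /* Reinterpret the float's bits without violating aliasing rules. */
    memcpy(&bits, &x, sizeof bits);

    /* Classic magic-constant initial guess for 1/sqrt(x). */
    bits = 0x5f3759dfu - (bits >> 1);
    memcpy(&y, &bits, sizeof y);

    /* One Newton-Raphson step: y' = y * (1.5 - 0.5 * x * y * y). */
    y = y * (1.5f - 0.5f * x * y * y);
    return y;
}

/* x * y * y equals 1 for an exact result, so this residual is about
   twice the relative error of the approximation. */
static inline float RSqrtResidual(float x)
{
    float y = ApproxRSquareRootF(x);
    return x * y * y - 1.0f;
}
```

With a single refinement step this trick is commonly cited as accurate to roughly 0.2% in the worst case, which is coarser than the bound Intel documents for _mm_rsqrt_ss. So it would make both paths approximate, but it would not make them return identical values.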

strangezakary, Nov 30 '18 16:11

Typically people are used to SIMD functions having low precision, for example custom cos or sin functions that expect inputs to be mapped into the range -pi to pi before calling. So I don't think it's too big of a deal.

RandyGaul, Jul 12 '19 18:07