Quadmath.jl Float128 fma() subtly corrupts state on Windows

trafficstars

On Windows (but not Linux or OSX) calls to fma() somehow break the overflow handling.

julia> h, x = floatmax(Float128), Float128(1)
julia> h + h
inf
julia> fma(x,x,x);
julia> h + h
1.18973149535723176508575932662800701e+4932

I don't see anything wrong with our wrappers; it may be a libquadmath and/or libgcc issue.

Apr 20 '19 21:04 RalphAS

Is it correct for other values? Not sure if this is related: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89459

Apr 21 '19 05:04 simonbyrne

After calling fma on Float128, overflow seems generally broken for +,-,*,/ with Float128 or Float64 arguments (tested with mixtures of huge and moderate values). I surmise that some floating point control structure in the mingw runtime is not being reset properly.

Since things work on Linux, I don't think this one is an upstream GCC issue. (I believe I'm using a fairly old GCC on Linux without trouble here.)

Apr 21 '19 18:04 RalphAS

ah, if it's breaking Float64 rounding, then I suspect that is related to https://github.com/JuliaLang/julia/issues/9847#issuecomment-71174579 / https://sourceware.org/bugzilla/show_bug.cgi?id=17907

In that case, maybe we should just disable fma on Windows? Or call the MPFR one?

Apr 22 '19 02:04 simonbyrne

I'm really here for the trouble it gave me in #30 and the challenge of a level-1 puzzle to pin that down, so either of your suggestions LGTM.

Apr 22 '19 04:04 RalphAS

Maybe we just remove fma on Windows for the time being?

Apr 22 '19 17:04 simonbyrne

I have disabled fma on Windows for now. Once we have a better solution we can add it back.

Apr 24 '19 04:04 simonbyrne

Opinion?

@static if Sys.iswindows()
    const SPLITTER = 1 + Float128(2)^57
    const THRESHOLD = Float128(2)^16326

    function two_sum(x::Float128, y::Float128)
        hi = x + y
        a  = hi - y
        b  = hi - a
        lo = (x - a) + (y - b)
        return hi, lo
    end
    
    function sum_three(x::Float128, y::Float128, z::Float128)
        s, t  = two_sum(y, z)
        u, v  = two_sum(x, s)
        hi    = t + v + u
        return hi
    end
    
    function splitter(x::Float128)
        x < THRESHOLD || throw(DomainError("Threshold $THRESHOLD exceeded"))
        a = x * SPLITTER
        b = x - a
        x1 = a + b
        x2 = x - x1
        return x1, x2
    end
    
    function two_prod(x::Float128, y::Float128)
        hi = x * y
        x1, x2 = splitter(x)
        y1, y2 = splitter(y)
        lo = x2 * y2 - (((hi - x1*y1) - x1*y2) - x2*y1)
        return hi, lo
    end 
 
    function fma(x::Float128, y::Float128, z::Float128)
        xyhi, xylo = two_prod(x, y)
        result = sum_three(z, xyhi, xylo)
        return result
    end
    
 end

May 03 '19 04:05 JeffreySarnoff

fmas are subtly difficult to get correct (especially dealing with underflow and subnormals). I think the BigFloat option is probably the best for now until we can figure out what is going on.

May 03 '19 17:05 simonbyrne

It's fine by me -- the more important aspect of this is let the Windows version work for clients who use fma. Why not implement the BigFloat fma stopgap until the cause is dissolved or resolved?

May 08 '19 01:05 JeffreySarnoff

Thought it worth the revisit

if Sys.iswindows()
    function Base.fma(x::Float128, y::Float128, z::Float128)
        oldprec = precision(BigFloat)
        setprecision(BigFloat, 512)   # at least 448 == Int(128 * 3.5) 
        result = Float128(x * y + z)
        setprecision(BigFloat, oldprec)
        return result
    end
else
   # ccall
end

Jun 14 '20 11:06 JeffreySarnoff

@JeffreySarnoff would you mind opening a pull request?

Isn't fma defined for BigFloat?

Jun 15 '20 15:06 simonbyrne

fma is defined for BigFloat and the logic will process fma(x,y,z) at the precision set for BigFloat. That precision may be well in excess of the precision demanded for Float128s and take longer than it might, or it may be much less than that and deliver an imprecise result. Either way, we need to manipulate setprecision(BigFloat) when fma is called (for Win). I'll PR it.

Jun 15 '20 17:06 JeffreySarnoff

Has this "On Windows (but not Linux or OSX) calls to fma() somehow break the overflow handling." fixed itself with the current libquadmath that we use? If so, we should remove the code that blocks fma on Windows.

Apr 16 '22 15:04 JeffreySarnoff

It appears to be unchanged for current and nightly Julia.

The example at the top is included as a broken test for Windows in the commit shown here: https://github.com/RalphAS/Quadmath.jl/actions/runs/2178209027

Apr 16 '22 23:04 RalphAS

Quadmath.jl Quadmath.jl copied to clipboard

Float128 fma() subtly corrupts state on Windows

Quadmath.jl
Quadmath.jl copied to clipboard