Quadmath.jl
Quadmath.jl copied to clipboard
Float128 fma() subtly corrupts state on Windows
On Windows (but not Linux or OSX) calls to fma() somehow break the overflow handling.
julia> h, x = floatmax(Float128), Float128(1)
julia> h + h
inf
julia> fma(x,x,x);
julia> h + h
1.18973149535723176508575932662800701e+4932
I don't see anything wrong with our wrappers; it may be a libquadmath and/or libgcc issue.
Is it correct for other values? Not sure if this is related: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89459
After calling fma on Float128, overflow seems generally broken for +,-,*,/ with Float128 or Float64 arguments (tested with mixtures of huge and moderate values). I surmise that some floating point control structure in the mingw runtime is not being reset properly.
Since things work on Linux, I don't think this one is an upstream GCC issue. (I believe I'm using a fairly old GCC on Linux without trouble here.)
ah, if it's breaking Float64 rounding, then I suspect that is related to https://github.com/JuliaLang/julia/issues/9847#issuecomment-71174579 / https://sourceware.org/bugzilla/show_bug.cgi?id=17907
In that case, maybe we should just disable fma on Windows? Or call the MPFR one?
I'm really here for the trouble it gave me in #30 and the challenge of a level-1 puzzle to pin that down, so either of your suggestions LGTM.
Maybe we just remove fma on Windows for the time being?
I have disabled fma on Windows for now. Once we have a better solution we can add it back.
Opinion?
@static if Sys.iswindows()
const SPLITTER = 1 + Float128(2)^57
const THRESHOLD = Float128(2)^16326
function two_sum(x::Float128, y::Float128)
hi = x + y
a = hi - y
b = hi - a
lo = (x - a) + (y - b)
return hi, lo
end
function sum_three(x::Float128, y::Float128, z::Float128)
s, t = two_sum(y, z)
u, v = two_sum(x, s)
hi = t + v + u
return hi
end
function splitter(x::Float128)
x < THRESHOLD || throw(DomainError("Threshold $THRESHOLD exceeded"))
a = x * SPLITTER
b = x - a
x1 = a + b
x2 = x - x1
return x1, x2
end
function two_prod(x::Float128, y::Float128)
hi = x * y
x1, x2 = splitter(x)
y1, y2 = splitter(y)
lo = x2 * y2 - (((hi - x1*y1) - x1*y2) - x2*y1)
return hi, lo
end
function fma(x::Float128, y::Float128, z::Float128)
xyhi, xylo = two_prod(x, y)
result = sum_three(z, xyhi, xylo)
return result
end
end
fmas are subtly difficult to get correct (especially dealing with underflow and subnormals). I think the BigFloat option is probably the best for now until we can figure out what is going on.
It's fine by me -- the more important aspect of this is let the Windows version work for clients who use fma. Why not implement the BigFloat fma stopgap until the cause is dissolved or resolved?
Thought it worth the revisit
if Sys.iswindows()
function Base.fma(x::Float128, y::Float128, z::Float128)
oldprec = precision(BigFloat)
setprecision(BigFloat, 512) # at least 448 == Int(128 * 3.5)
result = Float128(x * y + z)
setprecision(BigFloat, oldprec)
return result
end
else
# ccall
end
@JeffreySarnoff would you mind opening a pull request?
Isn't fma defined for BigFloat?
fma is defined for BigFloat and the logic will process fma(x,y,z) at the precision set for BigFloat. That precision may be well in excess of the precision demanded for Float128s and take longer than it might, or it may be much less than that and deliver an imprecise result. Either way, we need to manipulate setprecision(BigFloat) when fma is called (for Win). I'll PR it.
Has this "On Windows (but not Linux or OSX) calls to fma() somehow break the overflow handling." fixed itself with the current libquadmath that we use? If so, we should remove the code that blocks fma on Windows.
It appears to be unchanged for current and nightly Julia.
The example at the top is included as a broken test for Windows in the commit shown here: https://github.com/RalphAS/Quadmath.jl/actions/runs/2178209027