remove setrounding from functions implementation
Some functions do not work on 1.6 on windows. Particularly, I found the functions abs, mig, fma, sqrt . The core problem seems that they use setrounding. abs calls mig and mig and fma use directly setrounding (source code here ). sqrt I am not sure, this code does use set rounding, but the other functions defined there (inv, tanh etc.) do not fail. Here's the full stacktrace for reference
julia> sqrt(1..2)
ERROR: could not load symbol "fesetround":
The specified procedure could not be found.
Stacktrace:
[1] setrounding_raw
@ ~\.julia\packages\SetRounding\MjtxK\src\SetRounding.jl:45 [inlined]
[2] setrounding(#unused#::Type{Float64}, r::RoundingMode{:Down})
@ SetRounding ~\.julia\packages\SetRounding\MjtxK\src\SetRounding.jl:47
[3] setrounding(f::IntervalArithmetic.var"#45#46"{Float64}, #unused#::Type{Float64}, rounding::RoundingMode{:Down})
@ Base.Rounding .\rounding.jl:174
[4] sqrt
@ ~\.julia\packages\IntervalArithmetic\odPjL\src\intervals\rounding.jl:220 [inlined]
[5] sqrt
@ ~\.julia\packages\IntervalArithmetic\odPjL\src\intervals\rounding.jl:232 [inlined]
[6] sqrt
@ ~\.julia\packages\IntervalArithmetic\odPjL\src\intervals\rounding.jl:284 [inlined]
[7] sqrt(a::Interval{Float64})
@ IntervalArithmetic ~\.julia\packages\IntervalArithmetic\odPjL\src\intervals\functions.jl:233
[8] top-level scope
@ REPL[29]:1
julia> mig(1..2)
ERROR: could not load symbol "fesetround":
The specified procedure could not be found.
Stacktrace:
[1] setrounding_raw
@ ~\.julia\packages\SetRounding\MjtxK\src\SetRounding.jl:45 [inlined]
[2] setrounding(#unused#::Type{Float64}, r::RoundingMode{:Down})
@ SetRounding ~\.julia\packages\SetRounding\MjtxK\src\SetRounding.jl:47
[3] setrounding(f::IntervalArithmetic.var"#127#128"{Interval{Float64}}, #unused#::Type{Float64}, rounding::RoundingMode{:Down})
@ Base.Rounding .\rounding.jl:174
[4] mig(a::Interval{Float64})
@ IntervalArithmetic ~\.julia\packages\IntervalArithmetic\odPjL\src\intervals\arithmetic.jl:297
[5] top-level scope
@ REPL[30]:1
julia> fma(1..2, 1..2, 1..2)
ERROR: could not load symbol "fesetround":
The specified procedure could not be found.
Stacktrace:
[1] setrounding_raw
@ ~\.julia\packages\SetRounding\MjtxK\src\SetRounding.jl:45 [inlined]
[2] setrounding(#unused#::Type{Float64}, r::RoundingMode{:Down})
@ SetRounding ~\.julia\packages\SetRounding\MjtxK\src\SetRounding.jl:47
[3] setrounding(f::IntervalArithmetic.var"#123#125"{Interval{Float64}, Interval{Float64}, Interval{Float64}}, #unused#::Type{Float64}, rounding::RoundingMode{:Down})
@ Base.Rounding .\rounding.jl:174
[4] fma(a::Interval{Float64}, b::Interval{Float64}, c::Interval{Float64})
@ IntervalArithmetic ~\.julia\packages\IntervalArithmetic\odPjL\src\intervals\arithmetic.jl:264
[5] top-level scope
@ REPL[31]:1
julia> abs(1..2)
ERROR: could not load symbol "fesetround":
The specified procedure could not be found.
Stacktrace:
[1] setrounding_raw
@ ~\.julia\packages\SetRounding\MjtxK\src\SetRounding.jl:45 [inlined]
[2] setrounding(#unused#::Type{Float64}, r::RoundingMode{:Down})
@ SetRounding ~\.julia\packages\SetRounding\MjtxK\src\SetRounding.jl:47
[3] setrounding(f::IntervalArithmetic.var"#127#128"{Interval{Float64}}, #unused#::Type{Float64}, rounding::RoundingMode{:Down})
@ Base.Rounding .\rounding.jl:174
[4] mig
@ ~\.julia\packages\IntervalArithmetic\odPjL\src\intervals\arithmetic.jl:297 [inlined]
[5] abs(a::Interval{Float64})
@ IntervalArithmetic ~\.julia\packages\IntervalArithmetic\odPjL\src\intervals\arithmetic.jl:314
[6] top-level scope
@ REPL[32]:1
julia> tanh(1..2)
[0.761594, 0.964028]
julia> inv(1..2)
[0.5, 1]
At least for mig I don't see why setrounding is needed. The absolute value simply changes the sign bit, so no rounding error should be introduced?
julia> a = -sqrt(2)
-1.4142135623730951
julia> bitstring(a)
"1011111111110110101000001001111001100110011111110011101111001101"
julia> bitstring(abs(a))
"0011111111110110101000001001111001100110011111110011101111001101"
I think it could just be
function mig(a::Interval{T}) where T<:Real
isempty(a) && return convert(eltype(a), NaN)
zero(a.lo) โ a && return zero(a.lo)
min( abs(a.lo), abs(a.hi) )
end
or is there something I am missing?
for fma I think we do need direct rounding, however I am not sure what would be the best way to approach it. As far as I can tell, none of the macros such as @rounding, @rounding_down would work, since they seem to accept only unary and binary functions. Also I think those macros are only for functions supported by CRlibm and fma isn't? Maybe carrying out the scalar fma computations with BigFloat? This seems to introduce some computational overhead though
Also, looking at the current implementation of fma, I don't quite get why min_exclude_nans is used. The only case in which we can have NaNs is if one of the input intervals is [NaN, NaN] , but in that case all scalar fma computations will return NaN and hence e.g. min_exclude_nans will throw an error. Here's a demonstration in julia 1.5
julia> a = Interval(NaN, NaN)
[NaN, NaN]
julia> b = 1.1..2.2
[1.09999, 2.20001]
julia> fma(a, b, b)
ERROR: MethodError: no method matching min()
Closest candidates are:
min(::Missing, ::Missing) at missing.jl:124
min(::Missing, ::Any) at missing.jl:125
min(::BigFloat, ::BigFloat) at mpfr.jl:685
...
Stacktrace:
[1] min_ignore_nans at C:\Users\lucaa\.julia\packages\IntervalArithmetic\odPjL\src\intervals\arithmetic.jl:239 [inlined]
[2] #123 at C:\Users\lucaa\.julia\packages\IntervalArithmetic\odPjL\src\intervals\arithmetic.jl:269 [inlined]
[3] setrounding(::IntervalArithmetic.var"#123#125"{Interval{Float64},Interval{Float64},Interval{Float64}}, ::Type{Float64}, ::RoundingMode{:Down}) at .\rounding.jl:176
[4] fma(::Interval{Float64}, ::Interval{Float64}, ::Interval{Float64}) at C:\Users\lucaa\.julia\packages\IntervalArithmetic\odPjL\src\intervals\arithmetic.jl:264
[5] top-level scope at REPL[35]:1
I think if any of the inputs is NaI, then also the final output should be NaI?
@JeffreySarnoff: Do you have any thoughts about directed rounding for fma?
I like it.
I need to play with fma and see if/how there is helpful, guidable meshing with [one of] our approaches to rounding. (tonight)
step 0: we need Julia to intervene, restoring our ability to choose, to set, to get IEEE floating point rounding modes
(that is available only with BigFloats cannot continue forward .. so much of our numerical table settings
that provide places for work and relaxation, and services that other persons use to great advantage ..
is in part reliant upon effective, efficient, performant, functions of floating point environment direction and
at computation time, least significant bits' rounding control that do the same in the same way cross-platform.
so please alert those who may think this is has been resolved -- let them know what you know of things now in the way
You can obtain for fma(x, y, z) the result rounded to nearest with nearest = fma(x, y, z) as RoundNearest is the intialization setting (though we really need to have the capability to ensure it is RoundNearest or other Rounding modes).
With the ability to wrap setrounding(Type, ..) as a block covering a calculation is probably the fastest way [or one of two fastest]
to obtain the corresponding roundup = fma(x,y,z) and rounddown = fma(x,y,z).
Absent that, one may use
function interval_fma(a::T, b::T, c::T) where {T<:Base.IEEEFloat}
hi, lo = two_fma(a, b, c)
if signbit(lo)
lo = prevfloat(hi)
elseif !zero(lo)
lo = hi
hi = nextfloat(hi)
else
lo = hi
end
return hi, lo
end
function two_fma(a::T, b::T, c::T) where {T<:Base.IEEEFloat}
hi = fma(a, b, c)
hi0, lo0 = two_prod(a, b)
hi1, lo1 = two_sum(c, lo0)
hi2, lo2 = two_sum(hi0, hi1)
lo = ((hi2 - hi) + lo2) + lo1
return hi, lo
end
"""
two_sum(a, b)
Computes `hi = fl(a+b)` and `lo = err(a+b)`.
"""
@inline function two_sum(a::T, b::T) where {T<:Base.IEEEFloat}
hi = a + b
v = hi - a
lo = (a - (hi - v)) + (b - v)
return hi, lo
end
"""
two_prod(a, b)
Computes `hi = fl(a*b)` and `lo = fl(err(a*b))`.
"""
@inline function two_prod(a::T, b::T) where {T<:Base.IEEEFloat}
hi = a * b
lo = fma(a, b, -hi)
hi, lo
end
(which approach is more performant depends on how performant or otherwise setting and resetting rounding happens to be, and specifics of the processor [can we run rounding control outside of the fpu and run the fmas in SIMD registers] )
The issue with fma, mig and sqrt seems to be the function setrounding_raw defined in SetRounding.jl, I think the problem is the same causing this issue. I don't know to what platform that issue refers, but based on the results of my latest PR here, it seems that the problem is from Julia 1.6 onwards and on windows only.
On the bright side, I tried the alternative proposed by JeffreySarnoff and it seems to work ๐ . I also did a small benchmarking with 1.5, where the original version with setrounding works, and they seem to be equivalent, Jeffrey's version is maybe a little faster. In the following snippet, fma is the original one and fma1 is Jeffrey's variant.
julia> a, b, c = 1.1..2.1, 1.2..2.2, 1.3..2.3
([1.09999, 2.10001], [1.19999, 2.20001], [1.29999, 2.30001])
julia> fma(a, b, c) === fma1(a, b, c)
true
julia> @btime fma($a, $b, $c)
557.297 ns (20 allocations: 512 bytes)
[2.61999, 6.92001]
julia> @btime fma1($a, $b, $c)
522.917 ns (20 allocations: 512 bytes)
[2.61999, 6.92001]
I don't have the necessary skills or knowledge to comment about pros and cons of using setrounding, but I follow the discussion with interest.
after the PR in SetRounding.jl this seems to be fixed now. I think fma shouldn't use min_ignore_nan (at least not in its current form) because 1) the current implementation returns an error if all its arguments are NaN 2) the current implementation seems to be type unstable on julia 1.6. However I guess this is out of the original scope of this issue.
Where is the current implementation type unstable?
the function min_ignore_nans (sorry I misspelled this in all previous comments) seems to be type unstable
julia> @code_warntype IntervalArithmetic.min_ignore_nans(1.1, 2.2, 3.3)
Variables
#self#::Core.Const(IntervalArithmetic.min_ignore_nans)
args::Tuple{Float64, Float64, Float64}
#119::IntervalArithmetic.var"#119#120"
Body::Any
1 โ %1 = Base.getproperty(IntervalArithmetic.Iterators, :filter)::Core.Const(Base.Iterators.filter)
โ (#119 = %new(IntervalArithmetic.:(var"#119#120")))
โ %3 = #119::Core.Const(IntervalArithmetic.var"#119#120"())
โ %4 = (%1)(%3, args)::Base.Iterators.Filter{IntervalArithmetic.var"#119#120", Tuple{Float64, Float64, Float64}}
โ %5 = Core._apply_iterate(Base.iterate, IntervalArithmetic.min, %4)::Any
โโโ return %5
while on v1.5
julia> @code_warntype IntervalArithmetic.min_ignore_nans(1.1, 2.2, 3.3)
Variables
#self#::Core.Compiler.Const(IntervalArithmetic.min_ignore_nans, false)
args::Tuple{Float64,Float64,Float64}
#119::IntervalArithmetic.var"#119#120"
Body::Float64
1 โ %1 = Base.getproperty(IntervalArithmetic.Iterators, :filter)::Core.Compiler.Const(Base.Iterators.filter, false)
โ (#119 = %new(IntervalArithmetic.:(var"#119#120")))
โ %3 = #119::Core.Compiler.Const(IntervalArithmetic.var"#119#120"(), false)
โ %4 = (%1)(%3, args)::Base.Iterators.Filter{IntervalArithmetic.var"#119#120",Tuple{Float64,Float64,Float64}}
โ %5 = Core._apply_iterate(Base.iterate, IntervalArithmetic.min, %4)::Float64
โโโ return %5
I think the instability is caused by the splatting inside min because the following function is identical to the current implementation, except that it uses minimum instead of min and seems to be type stable
julia> f(args...) = minimum(Iterators.filter(!isnan, args))
f (generic function with 2 methods)
julia> @code_warntype f(1.1, 2.2, 3.3)
Variables
#self#::Core.Const(f)
args::Tuple{Float64, Float64, Float64}
Body::Float64
1 โ %1 = Base.Iterators.filter::Core.Const(Base.Iterators.filter)
โ %2 = !Main.isnan::Core.Const(Base.var"#78#79"{typeof(isnan)}(isnan))
โ %3 = (%1)(%2, args)::Base.Iterators.Filter{Base.var"#78#79"{typeof(isnan)}, Tuple{Float64, Float64, Float64}}
โ %4 = Main.minimum(%3)::Float64
โโโ return %4
as a start, these more structured variants are type stable
function max_ignore_nans(args::Vector{T}) where {T<:AbstractFloat}
max(filter(x->!isnan(x), args)...)
end
function max_ignore_nans22(args::Vararg{T}) where {T}
max(filter(x->!isnan(x), args)...)
end
Great to have a fix for the type instability!
I still don't get why those functions are needed in the first place though. The only case in which one of the scalar fma can return a NaN is if one of more input intervals are [NaN, NaN], in which case all 4 scalars will be NaN and thus the min_ignore_nans will return an error, because trying to compute min with zero arguments.
julia> a = Interval(NaN, NaN)
[NaN, NaN]
julia> b = 1..2
[1, 2]
julia> fma(a, b, b)
ERROR: MethodError: no method matching min()
Closest candidates are:
min(::Dates.AbstractTime) at C:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.6\Dates\src\types.jl:438
min(::DecoratedInterval{T}, ::DecoratedInterval{T}) where T at C:\Users\lucaa\.julia\packages\IntervalArithmetic\odPjL\src\decorations\functions.jl:216
min(::CartesianIndex{N}, ::CartesianIndex{N}) where N at multidimensional.jl:118
...
Stacktrace:
[1] min_ignore_nans
@ ~\.julia\packages\IntervalArithmetic\odPjL\src\intervals\arithmetic.jl:239 [inlined]
[2] #123
@ ~\.julia\packages\IntervalArithmetic\odPjL\src\intervals\arithmetic.jl:269 [inlined]
[3] setrounding(f::IntervalArithmetic.var"#123#125"{Interval{Float64}, Interval{Float64}, Interval{Float64}}, #unused#::Type{Float64}, rounding::RoundingMode{:Down})
@ Base.Rounding .\rounding.jl:176
[4] fma(a::Interval{Float64}, b::Interval{Float64}, c::Interval{Float64})
@ IntervalArithmetic ~\.julia\packages\IntervalArithmetic\odPjL\src\intervals\arithmetic.jl:264
[5] top-level scope
@ REPL[62]:1
Could we simply check if some of the inputs are NaN at the beginning, return [NaN, NaN] if they are and use normal min and max in the computations?
if any is NaN the result is (NaN, NaN) a function guard is in order
@inline function nan_guard(a::T, b::T, c::T)
isnan(a+b+c)
end
function interval_fma(a::T, b::T, c::T)
nan_guard(a, b, c) && return (T(NaN), T(NaN))
# as before without the filtering of NaNs
end
@inline nan_guard(a::T, b::T, c::T) where {T} = isnan(a+b+c)
function interval_fma(a::T, b::T, c::T) where {T<:Base.IEEEFloat}
nan_guard(a, b, c) && return(T(NaN), T(NaN))
hi, lo = two_fma(a, b, c)
if signbit(lo)
lo = prevfloat(hi)
elseif !zero(lo)
lo = hi
hi = nextfloat(hi)
else
lo = hi
end
return hi, lo
end
function two_fma(a::T, b::T, c::T) where {T<:Base.IEEEFloat}
hi = fma(a, b, c)
hi0, lo0 = two_prod(a, b)
hi1, lo1 = two_sum(c, lo0)
hi2, lo2 = two_sum(hi0, hi1)
lo = ((hi2 - hi) + lo2) + lo1
return hi, lo
end
"""
two_sum(a, b)
Computes `hi = fl(a+b)` and `lo = err(a+b)`.
"""
@inline function two_sum(a::T, b::T) where {T<:Base.IEEEFloat}
hi = a + b
v = hi - a
lo = (a - (hi - v)) + (b - v)
return hi, lo
end
"""
two_prod(a, b)
Computes `hi = fl(a*b)` and `lo = fl(err(a*b))`.
"""
@inline function two_prod(a::T, b::T) where {T<:Base.IEEEFloat}
hi = a * b
lo = fma(a, b, -hi)
hi, lo
end
Now that SetRounding.jl is fixed, do we still need the manual directed rounding or can we simply use something like:
function fma(a::Interval{T}, b::Interval{T}, c::Interval{T}) where T
#T = promote_type(eltype(a), eltype(b), eltype(c))
(isempty(a) || isempty(b) || isempty(c)) && return emptyinterval(T)
(isnan(a) || isnan(b) || isnan(c)) && return a+b+c
if isentire(a)
b == zero(b) && return c
return entireinterval(T)
elseif isentire(b)
a == zero(a) && return c
return entireinterval(T)
end
lo = setrounding(T, RoundDown) do
lo1 = fma(a.lo, b.lo, c.lo)
lo2 = fma(a.lo, b.hi, c.lo)
lo3 = fma(a.hi, b.lo, c.lo)
lo4 = fma(a.hi, b.hi, c.lo)
min(lo1, lo2, lo3, lo4)
end
hi = setrounding(T, RoundUp) do
hi1 = fma(a.lo, b.lo, c.hi)
hi2 = fma(a.lo, b.hi, c.hi)
hi3 = fma(a.hi, b.lo, c.hi)
hi4 = fma(a.hi, b.hi, c.hi)
max(hi1, hi2, hi3, hi4)
end
Interval(lo, hi)
end
this is a slightly modified version of the current one, using normal min and max and checking for NaI at the beginning of the function.ยจ
Not sure how this compares performancewise to the manual directed rounding, I can try to craft a benchmark on my machine.
I am putting together some comparisons too. What is happening here? What is the correct intuition?
julia> a=Interval(sin(0.5))
[0.479425, 0.479426]
julia> a.lo==a.hi
true
I had expected
julia> using SetRounding
julia> setrounding(Float64, RoundDown); lo=sin(0.5); setrounding(Float64, RoundUp); hi=sin(0.5);
julia> lo,hi
(0.47942553860420295, 0.479425538604203)
julia> a=Interval(lo,hi)
[0.479425, 0.479426]
julia> a.lo == a.hi
false
The Interval constructor and the interval method do not perform directed rounding, they merely assign the given input numbers to the attributes of the interval. The "smart" interval which performs directed rounding is obtained with the @interval macro or the .. method
julia> a = @interval sin(0.5)
[0.479425, 0.479426]
julia> a.lo == a.hi
false
julia> using SetRounding
julia> setrounding(Float64, RoundDown); lo = sin(0.5); setrounding(Float64, RoundUp); hi = sin(0.5); setrounding(Float64, RoundNearest)
0
julia> lo == a.lo
true
julia> hi == a.hi
true
why does
julia> a=Interval(sin(0.5))
[0.479425, 0.479426]
julia> a.lo==a.hi
true
show [0.479425, 0.479426] rather than [0.479426, 0.479426]
as a.lo==a.hi==0.479_425_538_604_203
It seems the display function automatically rounds down the lower bound and rounds up the upper bound to the chosen number of digits (6 by default), regardless of whether the lower and upper bound are the same https://github.com/JuliaIntervals/IntervalArithmetic.jl/blob/master/src/display.jl#L190-L191
If only @interval or .. or were used to construct intervals, then I think this wouldn't be an issue. If the interval is created with interval or Interval then the printing can be misleading in some situations I guess. cc @dpsanders
I think we should avoid setrounding completely.
Yes we should probably fix the display to it doesn't round if that's not necessary, if it's possible to do so.
There is code in display.jl that rounds the value after converting it to a string. That is a very powerful approach, and essential to the philosophical integrity of ArbFloats.jl (to show the value that is least misleading, the most veridical representation).
The code here is not trying to do that, so it should not round strings. Using round(x, digits=ndigits, base=2) or ..base=10) will serve better, and when both lo and hi are equal, just round one of them and use that for both in the display). Sometimes the string that results runs _9999999996 or other oddity, so the time convert to a string and round the last digit to be shown (rippling up as necessary) is after the numeric, digit bounded rounding.
I set up a small benchmark candidates are
fma with setrounding
function fma(a::Interval{T}, b::Interval{T}, c::Interval{T}) where T
#T = promote_type(eltype(a), eltype(b), eltype(c))
(isempty(a) || isempty(b) || isempty(c)) && return emptyinterval(T)
isnan(a+b+c) && return a + b + c
if isentire(a)
b == zero(b) && return c
return entireinterval(T)
elseif isentire(b)
a == zero(a) && return c
return entireinterval(T)
end
lo = setrounding(T, RoundDown) do
lo1 = fma(a.lo, b.lo, c.lo)
lo2 = fma(a.lo, b.hi, c.lo)
lo3 = fma(a.hi, b.lo, c.lo)
lo4 = fma(a.hi, b.hi, c.lo)
min(lo1, lo2, lo3, lo4)
end
hi = setrounding(T, RoundUp) do
hi1 = fma(a.lo, b.lo, c.hi)
hi2 = fma(a.lo, b.hi, c.hi)
hi3 = fma(a.hi, b.lo, c.hi)
hi4 = fma(a.hi, b.hi, c.hi)
max(hi1, hi2, hi3, hi4)
end
Interval(lo, hi)
end
and fma1 using manual directed rounding and the helping functions as defined above
function fma1(a::Interval{T}, b::Interval{T}, c::Interval{T}) where T
(isempty(a) || isempty(b) || isempty(c)) && return emptyinterval(T)
isnan(a+b+c) && return a + b + c
if isentire(a)
b == zero(b) && return c
return entireinterval(T)
elseif isentire(b)
a == zero(a) && return c
return entireinterval(T)
end
_, lo1 = fma_interval(a.lo, b.lo, c.lo)
_, lo2 = fma_interval(a.lo, b.hi, c.lo)
_, lo3 = fma_interval(a.hi, b.lo, c.lo)
_, lo4 = fma_interval(a.hi, b.hi, c.lo)
lo = min(lo1, lo2, lo3, lo4)
hi1, _ = fma_interval(a.lo, b.lo, c.hi)
hi2, _ = fma_interval(a.lo, b.hi, c.hi)
hi3, _ = fma_interval(a.hi, b.lo, c.hi)
hi4, _ = fma_interval(a.hi, b.hi, c.hi)
hi = max(hi1, hi2, hi3, hi4)
return Interval(lo, hi)
end
and the benchmarking
julia> a = b = c = 1.1..1.2
[1.09999, 1.20001]
julia> @btime fma($(Ref(a))[], $(Ref(b))[], $(Ref(c))[])
1.018 ฮผs (0 allocations: 0 bytes)
[2.30999, 2.64001]
julia> @btime fma1($(Ref(a))[], $(Ref(b))[], $(Ref(c))[])
61.071 ns (0 allocations: 0 bytes)
[2.30999, 2.64001]
so fma1 is also faster in addition to not using setrounding
Wow, that's really much faster. That's why pretty convincing!
15x is 10 times faster than 1.5x, which is my waterline for "worth the effort" I suppose fma1 is well worth the, apparently 1/10th effort, we expended. ๐
The version with manual directed rounded apparently fails for "half unbounded intervals", e.g.
julia> a = 1..2
[1, 2]
julia> b = -โ..3
[-โ, 3]
julia> c = 5..โ
[5, โ]
julia> fma(a, b, c)
[-โ, NaN]
Incidentally, now filtering out the NaN before taking the maximum/minimum fixes the problem (in the sense that all tests pass) ๐ . I replaced lo = min(lo1, lo2, lo3, lo4) with minimum(filter(!isnan, [lo1, lo2, lo3, lo4])) and max similarly and the new benchmark is
julia> a = b = c = 1.1..1.2
[1.09999, 1.20001]
julia> @btime fma($(Ref(a))[], $(Ref(b))[], $(Ref(c))[])
228.507 ns (4 allocations: 448 bytes)
[2.30999, 2.64001]
so still faster than the original one but quite a loss in efficiency. I think the problem is that Inf - Inf = NaN and thus the help functions (like two_sum, two_fma ) can return a NaN for infinite input. I wonder whether these edge cases can be taken into account without sacrificing efficiency ๐ค
Why not do something similar to
function FMA(a::T, b::T, c::T) where T
fma_hi = fma(a, b, c)
isnan(fma_hi) && return (T(NaN), T(NaN))
isinf(fma_hi) && return (fma_hi, fma_hi)
....
end
When I have used this pattern in the past, I wrote a fast binary-based isnotfinite(fma_hi) and only checked
isnan(fma_hi), isinf(fma_hi) when it had been determined that isnotfinite(fma_hi) (which happens rarely,
so having the one conditional is much better than always running through two conditionals).
Julia's !isfinite(x) has improved since then. It is faster to test !isfinite than to test isfinite.
Ok I think I figured out how to avoid filtering NaNs. Here is a running example to explain what I mean. The problem is that if we want to compute e.g. fma([1, 2], [-โ, 3], [2, โ]) then the four candidates for the upper bound are
h1 = fma(1, -โ, โ) = NaN
h2 = fma(2, -โ, โ) = NaN
h3 = fma(1, 3, โ) = โ
h4 = fma(2, 3, โ) = โ
and the first two poison the computation of the maximum and hence the need for max/min_ignore_nans. Followinf Jeffrey's suggestions I modified the function hi, lo = interval_fma(a, b, c), which returns the rounded up (hi) and rounded down (lo) versions of fma as follows:
function interval_fma(a::T, b::T, c::T) where {T}
hi = fma(a, b, c)
isnan(hi) && return convert(T, -Inf), convert(T, Inf) # first is hi, second is lo
!isfinite(hi) && return hi, hi
hi, lo = two_fma(a, b, c)
# and then like before
i.e. if the result of the scalar fma is NaN, it returns -Inf as upper bound and Inf as lower bound. This way those are automatically filtered out when taking the maximum of upper bounds and the minimum of lower bounds, because the only case in which min can return Inf is if all its inputs are Inf, which would mean that all 4 candidates would have returned NaN, which can happen only if some of the original input intervals were NaI or empty intervals or entire intervals, but these cases are already checked at the beginning of fma(a::Interval{T}, b::Interval{T}, c::Interval{T}).
Summa summarum, now all tests pass (the ITF1788 test suite has at least 108 tests checking those half unbounded intervals cases, so I guess they are quite exhaustive), no need for filtering NaN out and the benchmark is
julia> a = b = c = 1.1..1.2
[1.09999, 1.20001]
julia> @btime fma($(Ref(a))[], $(Ref(b))[], $(Ref(c))[])
46.255 ns (0 allocations: 0 bytes)
[2.30999, 2.64001]
as a final small side note, at the beginning of fma(a::Interval{T}, b::Interval{T}, c::Interval{T}) I changed
isnan(a+b+c) && return a + b + c
to
(isnan(a) || isnan(b) || isnan(c)) && return a + b + c
squeezing a little more efficiency, now the benchmark is
julia> @btime fma($(Ref(a))[], $(Ref(b))[], $(Ref(c))[])
35.549 ns (0 allocations: 0 bytes)
[2.30999, 2.64001]
I will soon polish the code and push it to the PR of this issue ( #443 ), comments, feedback and improvement suggestions are always welcome of course.
With the benchmarking I did this morning
(isnan(a) || isnan(b) || isnan(c)) && return a + b + c
is sometimes less performant than each of these
isnan(a+b+c) && return a+b+c
isnan(fma(a,b,c)) && return fma(a,b,c)
(it goes so quickly that when none are NaN, the one-off timings are indistinguishable) Where code use brings issues of overlap and latency to the fore, the use of || is not as performant because
I displayed @code_native for each of the three, and then removed the instructions they all shared
The version with '||' is the only one that has an internal jump (better avoided on general principles).
Are your benchmarks using real-world code or just comparing these routines?
If the latter, I am less persuaded. If the former, disregard this note.