IntervalArithmetic.jl icon indicating copy to clipboard operation
IntervalArithmetic.jl copied to clipboard

remove setrounding from functions implementation

Open lucaferranti opened this issue 4 years ago โ€ข 31 comments

Some functions do not work on 1.6 on windows. Particularly, I found the functions abs, mig, fma, sqrt . The core problem seems that they use setrounding. abs calls mig and mig and fma use directly setrounding (source code here ). sqrt I am not sure, this code does use set rounding, but the other functions defined there (inv, tanh etc.) do not fail. Here's the full stacktrace for reference

julia> sqrt(1..2)
ERROR: could not load symbol "fesetround":
The specified procedure could not be found.
Stacktrace:
 [1] setrounding_raw
   @ ~\.julia\packages\SetRounding\MjtxK\src\SetRounding.jl:45 [inlined]
 [2] setrounding(#unused#::Type{Float64}, r::RoundingMode{:Down})
   @ SetRounding ~\.julia\packages\SetRounding\MjtxK\src\SetRounding.jl:47
 [3] setrounding(f::IntervalArithmetic.var"#45#46"{Float64}, #unused#::Type{Float64}, rounding::RoundingMode{:Down})
   @ Base.Rounding .\rounding.jl:174
 [4] sqrt
   @ ~\.julia\packages\IntervalArithmetic\odPjL\src\intervals\rounding.jl:220 [inlined]
 [5] sqrt
   @ ~\.julia\packages\IntervalArithmetic\odPjL\src\intervals\rounding.jl:232 [inlined]
 [6] sqrt
   @ ~\.julia\packages\IntervalArithmetic\odPjL\src\intervals\rounding.jl:284 [inlined]
 [7] sqrt(a::Interval{Float64})
   @ IntervalArithmetic ~\.julia\packages\IntervalArithmetic\odPjL\src\intervals\functions.jl:233
 [8] top-level scope
   @ REPL[29]:1

julia> mig(1..2)
ERROR: could not load symbol "fesetround":
The specified procedure could not be found.
Stacktrace:
 [1] setrounding_raw
   @ ~\.julia\packages\SetRounding\MjtxK\src\SetRounding.jl:45 [inlined]
 [2] setrounding(#unused#::Type{Float64}, r::RoundingMode{:Down})
   @ SetRounding ~\.julia\packages\SetRounding\MjtxK\src\SetRounding.jl:47
 [3] setrounding(f::IntervalArithmetic.var"#127#128"{Interval{Float64}}, #unused#::Type{Float64}, rounding::RoundingMode{:Down})
   @ Base.Rounding .\rounding.jl:174
 [4] mig(a::Interval{Float64})
   @ IntervalArithmetic ~\.julia\packages\IntervalArithmetic\odPjL\src\intervals\arithmetic.jl:297
 [5] top-level scope
   @ REPL[30]:1

julia> fma(1..2, 1..2, 1..2)
ERROR: could not load symbol "fesetround":
The specified procedure could not be found.
Stacktrace:
 [1] setrounding_raw
   @ ~\.julia\packages\SetRounding\MjtxK\src\SetRounding.jl:45 [inlined]
 [2] setrounding(#unused#::Type{Float64}, r::RoundingMode{:Down})
   @ SetRounding ~\.julia\packages\SetRounding\MjtxK\src\SetRounding.jl:47
 [3] setrounding(f::IntervalArithmetic.var"#123#125"{Interval{Float64}, Interval{Float64}, Interval{Float64}}, #unused#::Type{Float64}, rounding::RoundingMode{:Down})
   @ Base.Rounding .\rounding.jl:174
 [4] fma(a::Interval{Float64}, b::Interval{Float64}, c::Interval{Float64})
   @ IntervalArithmetic ~\.julia\packages\IntervalArithmetic\odPjL\src\intervals\arithmetic.jl:264
 [5] top-level scope
   @ REPL[31]:1

julia> abs(1..2)
ERROR: could not load symbol "fesetround":
The specified procedure could not be found.
Stacktrace:
 [1] setrounding_raw
   @ ~\.julia\packages\SetRounding\MjtxK\src\SetRounding.jl:45 [inlined]
 [2] setrounding(#unused#::Type{Float64}, r::RoundingMode{:Down})
   @ SetRounding ~\.julia\packages\SetRounding\MjtxK\src\SetRounding.jl:47
 [3] setrounding(f::IntervalArithmetic.var"#127#128"{Interval{Float64}}, #unused#::Type{Float64}, rounding::RoundingMode{:Down})
   @ Base.Rounding .\rounding.jl:174
 [4] mig
   @ ~\.julia\packages\IntervalArithmetic\odPjL\src\intervals\arithmetic.jl:297 [inlined]
 [5] abs(a::Interval{Float64})
   @ IntervalArithmetic ~\.julia\packages\IntervalArithmetic\odPjL\src\intervals\arithmetic.jl:314
 [6] top-level scope
   @ REPL[32]:1

julia> tanh(1..2)
[0.761594, 0.964028]

julia> inv(1..2)
[0.5, 1]

lucaferranti avatar Mar 06 '21 21:03 lucaferranti

At least for mig I don't see why setrounding is needed. The absolute value simply changes the sign bit, so no rounding error should be introduced?

julia> a = -sqrt(2)
-1.4142135623730951

julia> bitstring(a)
"1011111111110110101000001001111001100110011111110011101111001101"

julia> bitstring(abs(a))
"0011111111110110101000001001111001100110011111110011101111001101"

I think it could just be

function mig(a::Interval{T}) where T<:Real
    isempty(a) && return convert(eltype(a), NaN)
    zero(a.lo) โˆˆ a && return zero(a.lo)
    min( abs(a.lo), abs(a.hi) )
end

or is there something I am missing?

lucaferranti avatar Mar 07 '21 09:03 lucaferranti

for fma I think we do need direct rounding, however I am not sure what would be the best way to approach it. As far as I can tell, none of the macros such as @rounding, @rounding_down would work, since they seem to accept only unary and binary functions. Also I think those macros are only for functions supported by CRlibm and fma isn't? Maybe carrying out the scalar fma computations with BigFloat? This seems to introduce some computational overhead though

lucaferranti avatar Mar 09 '21 17:03 lucaferranti

Also, looking at the current implementation of fma, I don't quite get why min_exclude_nans is used. The only case in which we can have NaNs is if one of the input intervals is [NaN, NaN] , but in that case all scalar fma computations will return NaN and hence e.g. min_exclude_nans will throw an error. Here's a demonstration in julia 1.5

julia> a = Interval(NaN, NaN)
[NaN, NaN]

julia> b = 1.1..2.2
[1.09999, 2.20001]

julia> fma(a, b, b)
ERROR: MethodError: no method matching min()
Closest candidates are:
  min(::Missing, ::Missing) at missing.jl:124
  min(::Missing, ::Any) at missing.jl:125
  min(::BigFloat, ::BigFloat) at mpfr.jl:685
  ...
Stacktrace:
 [1] min_ignore_nans at C:\Users\lucaa\.julia\packages\IntervalArithmetic\odPjL\src\intervals\arithmetic.jl:239 [inlined]
 [2] #123 at C:\Users\lucaa\.julia\packages\IntervalArithmetic\odPjL\src\intervals\arithmetic.jl:269 [inlined]
 [3] setrounding(::IntervalArithmetic.var"#123#125"{Interval{Float64},Interval{Float64},Interval{Float64}}, ::Type{Float64}, ::RoundingMode{:Down}) at .\rounding.jl:176
 [4] fma(::Interval{Float64}, ::Interval{Float64}, ::Interval{Float64}) at C:\Users\lucaa\.julia\packages\IntervalArithmetic\odPjL\src\intervals\arithmetic.jl:264
 [5] top-level scope at REPL[35]:1

I think if any of the inputs is NaI, then also the final output should be NaI?

lucaferranti avatar Mar 09 '21 18:03 lucaferranti

@JeffreySarnoff: Do you have any thoughts about directed rounding for fma?

dpsanders avatar Mar 09 '21 18:03 dpsanders

I like it.

JeffreySarnoff avatar Mar 09 '21 18:03 JeffreySarnoff

I need to play with fma and see if/how there is helpful, guidable meshing with [one of] our approaches to rounding. (tonight)

JeffreySarnoff avatar Mar 09 '21 18:03 JeffreySarnoff

step 0: we need Julia to intervene, restoring our ability to choose, to set, to get IEEE floating point rounding modes (that is available only with BigFloats cannot continue forward .. so much of our numerical table settings that provide places for work and relaxation, and services that other persons use to great advantage .. is in part reliant upon effective, efficient, performant, functions of floating point environment direction and at computation time, least significant bits' rounding control that do the same in the same way cross-platform.

so please alert those who may think this is has been resolved -- let them know what you know of things now in the way

You can obtain for fma(x, y, z) the result rounded to nearest with nearest = fma(x, y, z) as RoundNearest is the intialization setting (though we really need to have the capability to ensure it is RoundNearest or other Rounding modes). With the ability to wrap setrounding(Type, ..) as a block covering a calculation is probably the fastest way [or one of two fastest]
to obtain the corresponding roundup = fma(x,y,z) and rounddown = fma(x,y,z).

Absent that, one may use

function interval_fma(a::T, b::T, c::T) where {T<:Base.IEEEFloat}
  hi, lo = two_fma(a, b, c)
  if signbit(lo)
    lo = prevfloat(hi)
  elseif !zero(lo)
    lo = hi
    hi = nextfloat(hi)
  else 
    lo = hi
  end
  return hi, lo
end

function two_fma(a::T, b::T, c::T) where {T<:Base.IEEEFloat}
    hi = fma(a, b, c)
    hi0, lo0 = two_prod(a, b)
    hi1, lo1 = two_sum(c, lo0)
    hi2, lo2 = two_sum(hi0, hi1)
    lo = ((hi2 - hi) + lo2) + lo1
    return hi, lo
end

"""
    two_sum(a, b)

Computes `hi = fl(a+b)` and `lo = err(a+b)`.
"""
@inline function two_sum(a::T, b::T) where {T<:Base.IEEEFloat}
    hi = a + b
    v  = hi - a
    lo = (a - (hi - v)) + (b - v)
    return hi, lo
end

"""
    two_prod(a, b)

Computes `hi = fl(a*b)` and `lo = fl(err(a*b))`.
"""
@inline function two_prod(a::T, b::T) where {T<:Base.IEEEFloat}
    hi = a * b
    lo = fma(a, b, -hi)
    hi, lo
end

JeffreySarnoff avatar Mar 10 '21 07:03 JeffreySarnoff

(which approach is more performant depends on how performant or otherwise setting and resetting rounding happens to be, and specifics of the processor [can we run rounding control outside of the fpu and run the fmas in SIMD registers] )

JeffreySarnoff avatar Mar 10 '21 07:03 JeffreySarnoff

The issue with fma, mig and sqrt seems to be the function setrounding_raw defined in SetRounding.jl, I think the problem is the same causing this issue. I don't know to what platform that issue refers, but based on the results of my latest PR here, it seems that the problem is from Julia 1.6 onwards and on windows only.

On the bright side, I tried the alternative proposed by JeffreySarnoff and it seems to work ๐ŸŽ‰ . I also did a small benchmarking with 1.5, where the original version with setrounding works, and they seem to be equivalent, Jeffrey's version is maybe a little faster. In the following snippet, fma is the original one and fma1 is Jeffrey's variant.

 julia> a, b, c = 1.1..2.1, 1.2..2.2, 1.3..2.3
([1.09999, 2.10001], [1.19999, 2.20001], [1.29999, 2.30001])

julia> fma(a, b, c) === fma1(a, b, c)
true

julia> @btime fma($a, $b, $c)
  557.297 ns (20 allocations: 512 bytes)
[2.61999, 6.92001]

julia> @btime fma1($a, $b, $c)
  522.917 ns (20 allocations: 512 bytes)
[2.61999, 6.92001]

I don't have the necessary skills or knowledge to comment about pros and cons of using setrounding, but I follow the discussion with interest.

lucaferranti avatar Mar 10 '21 14:03 lucaferranti

after the PR in SetRounding.jl this seems to be fixed now. I think fma shouldn't use min_ignore_nan (at least not in its current form) because 1) the current implementation returns an error if all its arguments are NaN 2) the current implementation seems to be type unstable on julia 1.6. However I guess this is out of the original scope of this issue.

lucaferranti avatar Mar 19 '21 13:03 lucaferranti

Where is the current implementation type unstable?

JeffreySarnoff avatar Mar 19 '21 13:03 JeffreySarnoff

the function min_ignore_nans (sorry I misspelled this in all previous comments) seems to be type unstable

julia> @code_warntype IntervalArithmetic.min_ignore_nans(1.1, 2.2, 3.3)
Variables
  #self#::Core.Const(IntervalArithmetic.min_ignore_nans)
  args::Tuple{Float64, Float64, Float64}
  #119::IntervalArithmetic.var"#119#120"

Body::Any
1 โ”€ %1 = Base.getproperty(IntervalArithmetic.Iterators, :filter)::Core.Const(Base.Iterators.filter)
โ”‚        (#119 = %new(IntervalArithmetic.:(var"#119#120")))
โ”‚   %3 = #119::Core.Const(IntervalArithmetic.var"#119#120"())
โ”‚   %4 = (%1)(%3, args)::Base.Iterators.Filter{IntervalArithmetic.var"#119#120", Tuple{Float64, Float64, Float64}}
โ”‚   %5 = Core._apply_iterate(Base.iterate, IntervalArithmetic.min, %4)::Any
โ””โ”€โ”€      return %5

while on v1.5

julia> @code_warntype IntervalArithmetic.min_ignore_nans(1.1, 2.2, 3.3)
Variables
  #self#::Core.Compiler.Const(IntervalArithmetic.min_ignore_nans, false)
  args::Tuple{Float64,Float64,Float64}
  #119::IntervalArithmetic.var"#119#120"

Body::Float64
1 โ”€ %1 = Base.getproperty(IntervalArithmetic.Iterators, :filter)::Core.Compiler.Const(Base.Iterators.filter, false)
โ”‚        (#119 = %new(IntervalArithmetic.:(var"#119#120")))
โ”‚   %3 = #119::Core.Compiler.Const(IntervalArithmetic.var"#119#120"(), false)
โ”‚   %4 = (%1)(%3, args)::Base.Iterators.Filter{IntervalArithmetic.var"#119#120",Tuple{Float64,Float64,Float64}}
โ”‚   %5 = Core._apply_iterate(Base.iterate, IntervalArithmetic.min, %4)::Float64
โ””โ”€โ”€      return %5

I think the instability is caused by the splatting inside min because the following function is identical to the current implementation, except that it uses minimum instead of min and seems to be type stable

julia> f(args...) = minimum(Iterators.filter(!isnan, args))
f (generic function with 2 methods)

julia> @code_warntype f(1.1, 2.2, 3.3)
Variables
  #self#::Core.Const(f)
  args::Tuple{Float64, Float64, Float64}

Body::Float64
1 โ”€ %1 = Base.Iterators.filter::Core.Const(Base.Iterators.filter)
โ”‚   %2 = !Main.isnan::Core.Const(Base.var"#78#79"{typeof(isnan)}(isnan))
โ”‚   %3 = (%1)(%2, args)::Base.Iterators.Filter{Base.var"#78#79"{typeof(isnan)}, Tuple{Float64, Float64, Float64}}
โ”‚   %4 = Main.minimum(%3)::Float64
โ””โ”€โ”€      return %4

lucaferranti avatar Mar 19 '21 13:03 lucaferranti

as a start, these more structured variants are type stable

function max_ignore_nans(args::Vector{T}) where {T<:AbstractFloat}
    max(filter(x->!isnan(x), args)...)
end

function max_ignore_nans22(args::Vararg{T}) where {T}
    max(filter(x->!isnan(x), args)...)
end

JeffreySarnoff avatar Mar 19 '21 14:03 JeffreySarnoff

Great to have a fix for the type instability! I still don't get why those functions are needed in the first place though. The only case in which one of the scalar fma can return a NaN is if one of more input intervals are [NaN, NaN], in which case all 4 scalars will be NaN and thus the min_ignore_nans will return an error, because trying to compute min with zero arguments.

julia> a = Interval(NaN, NaN)
[NaN, NaN]

julia> b = 1..2
[1, 2]

julia> fma(a, b, b)
ERROR: MethodError: no method matching min()
Closest candidates are:
  min(::Dates.AbstractTime) at C:\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.6\Dates\src\types.jl:438
  min(::DecoratedInterval{T}, ::DecoratedInterval{T}) where T at C:\Users\lucaa\.julia\packages\IntervalArithmetic\odPjL\src\decorations\functions.jl:216
  min(::CartesianIndex{N}, ::CartesianIndex{N}) where N at multidimensional.jl:118
  ...
Stacktrace:
 [1] min_ignore_nans
   @ ~\.julia\packages\IntervalArithmetic\odPjL\src\intervals\arithmetic.jl:239 [inlined]
 [2] #123
   @ ~\.julia\packages\IntervalArithmetic\odPjL\src\intervals\arithmetic.jl:269 [inlined]
 [3] setrounding(f::IntervalArithmetic.var"#123#125"{Interval{Float64}, Interval{Float64}, Interval{Float64}}, #unused#::Type{Float64}, rounding::RoundingMode{:Down})
   @ Base.Rounding .\rounding.jl:176
 [4] fma(a::Interval{Float64}, b::Interval{Float64}, c::Interval{Float64})
   @ IntervalArithmetic ~\.julia\packages\IntervalArithmetic\odPjL\src\intervals\arithmetic.jl:264
 [5] top-level scope
   @ REPL[62]:1

Could we simply check if some of the inputs are NaN at the beginning, return [NaN, NaN] if they are and use normal min and max in the computations?

lucaferranti avatar Mar 19 '21 14:03 lucaferranti

if any is NaN the result is (NaN, NaN) a function guard is in order

@inline function nan_guard(a::T, b::T, c::T)
    isnan(a+b+c) 
end

function interval_fma(a::T, b::T, c::T)
    nan_guard(a, b, c) && return (T(NaN), T(NaN))
    # as before without the filtering of NaNs
end

JeffreySarnoff avatar Mar 19 '21 15:03 JeffreySarnoff

@inline nan_guard(a::T, b::T, c::T) where {T} = isnan(a+b+c) 

function interval_fma(a::T, b::T, c::T) where {T<:Base.IEEEFloat}
  nan_guard(a, b, c) && return(T(NaN), T(NaN))

  hi, lo = two_fma(a, b, c)
  if signbit(lo)
    lo = prevfloat(hi)
  elseif !zero(lo)
    lo = hi
    hi = nextfloat(hi)
  else 
    lo = hi
  end
  return hi, lo
end

function two_fma(a::T, b::T, c::T) where {T<:Base.IEEEFloat}
    hi = fma(a, b, c)
    hi0, lo0 = two_prod(a, b)
    hi1, lo1 = two_sum(c, lo0)
    hi2, lo2 = two_sum(hi0, hi1)
    lo = ((hi2 - hi) + lo2) + lo1
    return hi, lo
end

"""
    two_sum(a, b)

Computes `hi = fl(a+b)` and `lo = err(a+b)`.
"""
@inline function two_sum(a::T, b::T) where {T<:Base.IEEEFloat}
    hi = a + b
    v  = hi - a
    lo = (a - (hi - v)) + (b - v)
    return hi, lo
end

"""
    two_prod(a, b)

Computes `hi = fl(a*b)` and `lo = fl(err(a*b))`.
"""
@inline function two_prod(a::T, b::T) where {T<:Base.IEEEFloat}
    hi = a * b
    lo = fma(a, b, -hi)
    hi, lo
end

JeffreySarnoff avatar Mar 27 '21 11:03 JeffreySarnoff

Now that SetRounding.jl is fixed, do we still need the manual directed rounding or can we simply use something like:

function fma(a::Interval{T}, b::Interval{T}, c::Interval{T}) where T
    #T = promote_type(eltype(a), eltype(b), eltype(c))

    (isempty(a) || isempty(b) || isempty(c)) && return emptyinterval(T)
    (isnan(a) || isnan(b) || isnan(c)) && return a+b+c

    if isentire(a)
        b == zero(b) && return c
        return entireinterval(T)

    elseif isentire(b)
        a == zero(a) && return c
        return entireinterval(T)

    end

    lo = setrounding(T, RoundDown) do
        lo1 = fma(a.lo, b.lo, c.lo)
        lo2 = fma(a.lo, b.hi, c.lo)
        lo3 = fma(a.hi, b.lo, c.lo)
        lo4 = fma(a.hi, b.hi, c.lo)
        min(lo1, lo2, lo3, lo4)
    end

    hi = setrounding(T, RoundUp) do
        hi1 = fma(a.lo, b.lo, c.hi)
        hi2 = fma(a.lo, b.hi, c.hi)
        hi3 = fma(a.hi, b.lo, c.hi)
        hi4 = fma(a.hi, b.hi, c.hi)
        max(hi1, hi2, hi3, hi4)
    end

    Interval(lo, hi)
end

this is a slightly modified version of the current one, using normal min and max and checking for NaI at the beginning of the function.ยจ

Not sure how this compares performancewise to the manual directed rounding, I can try to craft a benchmark on my machine.

lucaferranti avatar Mar 27 '21 13:03 lucaferranti

I am putting together some comparisons too. What is happening here? What is the correct intuition?

julia> a=Interval(sin(0.5))
[0.479425, 0.479426]

julia> a.lo==a.hi
true

I had expected



julia> using SetRounding

julia> setrounding(Float64, RoundDown); lo=sin(0.5); setrounding(Float64, RoundUp); hi=sin(0.5);

julia> lo,hi
(0.47942553860420295, 0.479425538604203)

julia> a=Interval(lo,hi)
[0.479425, 0.479426]

julia> a.lo == a.hi
false

JeffreySarnoff avatar Mar 27 '21 13:03 JeffreySarnoff

The Interval constructor and the interval method do not perform directed rounding, they merely assign the given input numbers to the attributes of the interval. The "smart" interval which performs directed rounding is obtained with the @interval macro or the .. method

julia> a = @interval sin(0.5)
[0.479425, 0.479426]

julia> a.lo == a.hi
false

julia> using SetRounding

julia> setrounding(Float64, RoundDown); lo = sin(0.5); setrounding(Float64, RoundUp); hi = sin(0.5); setrounding(Float64, RoundNearest)
0

julia> lo == a.lo
true

julia> hi == a.hi
true

lucaferranti avatar Mar 27 '21 13:03 lucaferranti

why does

julia> a=Interval(sin(0.5))
[0.479425, 0.479426]

julia> a.lo==a.hi
true

show [0.479425, 0.479426] rather than [0.479426, 0.479426] as a.lo==a.hi==0.479_425_538_604_203

JeffreySarnoff avatar Mar 27 '21 13:03 JeffreySarnoff

It seems the display function automatically rounds down the lower bound and rounds up the upper bound to the chosen number of digits (6 by default), regardless of whether the lower and upper bound are the same https://github.com/JuliaIntervals/IntervalArithmetic.jl/blob/master/src/display.jl#L190-L191 If only @interval or .. or were used to construct intervals, then I think this wouldn't be an issue. If the interval is created with interval or Interval then the printing can be misleading in some situations I guess. cc @dpsanders

lucaferranti avatar Mar 27 '21 14:03 lucaferranti

I think we should avoid setrounding completely.

Yes we should probably fix the display to it doesn't round if that's not necessary, if it's possible to do so.

dpsanders avatar Mar 27 '21 14:03 dpsanders

There is code in display.jl that rounds the value after converting it to a string. That is a very powerful approach, and essential to the philosophical integrity of ArbFloats.jl (to show the value that is least misleading, the most veridical representation).

The code here is not trying to do that, so it should not round strings. Using round(x, digits=ndigits, base=2) or ..base=10) will serve better, and when both lo and hi are equal, just round one of them and use that for both in the display). Sometimes the string that results runs _9999999996 or other oddity, so the time convert to a string and round the last digit to be shown (rippling up as necessary) is after the numeric, digit bounded rounding.

JeffreySarnoff avatar Mar 27 '21 15:03 JeffreySarnoff

I set up a small benchmark candidates are

fma with setrounding

function fma(a::Interval{T}, b::Interval{T}, c::Interval{T}) where T
    #T = promote_type(eltype(a), eltype(b), eltype(c))

    (isempty(a) || isempty(b) || isempty(c)) && return emptyinterval(T)
    isnan(a+b+c) && return a + b + c

    if isentire(a)
        b == zero(b) && return c
        return entireinterval(T)

    elseif isentire(b)
        a == zero(a) && return c
        return entireinterval(T)

    end

    lo = setrounding(T, RoundDown) do
        lo1 = fma(a.lo, b.lo, c.lo)
        lo2 = fma(a.lo, b.hi, c.lo)
        lo3 = fma(a.hi, b.lo, c.lo)
        lo4 = fma(a.hi, b.hi, c.lo)
        min(lo1, lo2, lo3, lo4)
    end

    hi = setrounding(T, RoundUp) do
        hi1 = fma(a.lo, b.lo, c.hi)
        hi2 = fma(a.lo, b.hi, c.hi)
        hi3 = fma(a.hi, b.lo, c.hi)
        hi4 = fma(a.hi, b.hi, c.hi)
        max(hi1, hi2, hi3, hi4)
    end

    Interval(lo, hi)
end

and fma1 using manual directed rounding and the helping functions as defined above

function fma1(a::Interval{T}, b::Interval{T}, c::Interval{T}) where T

    (isempty(a) || isempty(b) || isempty(c)) && return emptyinterval(T)
    isnan(a+b+c) && return a + b + c

    if isentire(a)
        b == zero(b) && return c
        return entireinterval(T)

    elseif isentire(b)
        a == zero(a) && return c
        return entireinterval(T)
    end

    _, lo1 = fma_interval(a.lo, b.lo, c.lo)
    _, lo2 = fma_interval(a.lo, b.hi, c.lo)
    _, lo3 = fma_interval(a.hi, b.lo, c.lo)
    _, lo4 = fma_interval(a.hi, b.hi, c.lo)
    lo = min(lo1, lo2, lo3, lo4)

    hi1, _ = fma_interval(a.lo, b.lo, c.hi)
    hi2, _ = fma_interval(a.lo, b.hi, c.hi)
    hi3, _ = fma_interval(a.hi, b.lo, c.hi)
    hi4, _ = fma_interval(a.hi, b.hi, c.hi)
    hi = max(hi1, hi2, hi3, hi4)

    return Interval(lo, hi)
end

and the benchmarking

julia> a = b = c = 1.1..1.2
[1.09999, 1.20001]

julia> @btime fma($(Ref(a))[], $(Ref(b))[], $(Ref(c))[])
  1.018 ฮผs (0 allocations: 0 bytes)
[2.30999, 2.64001]

julia> @btime fma1($(Ref(a))[], $(Ref(b))[], $(Ref(c))[])
  61.071 ns (0 allocations: 0 bytes)
[2.30999, 2.64001]

so fma1 is also faster in addition to not using setrounding

lucaferranti avatar Mar 27 '21 19:03 lucaferranti

Wow, that's really much faster. That's why pretty convincing!

dpsanders avatar Mar 27 '21 19:03 dpsanders

15x is 10 times faster than 1.5x, which is my waterline for "worth the effort" I suppose fma1 is well worth the, apparently 1/10th effort, we expended. ๐Ÿ˜ƒ

JeffreySarnoff avatar Mar 27 '21 20:03 JeffreySarnoff

The version with manual directed rounded apparently fails for "half unbounded intervals", e.g.

julia> a = 1..2
[1, 2]

julia> b = -โˆž..3
[-โˆž, 3]

julia> c = 5..โˆž
[5, โˆž]

julia> fma(a, b, c)
[-โˆž, NaN]

Incidentally, now filtering out the NaN before taking the maximum/minimum fixes the problem (in the sense that all tests pass) ๐Ÿ˜ƒ . I replaced lo = min(lo1, lo2, lo3, lo4) with minimum(filter(!isnan, [lo1, lo2, lo3, lo4])) and max similarly and the new benchmark is

julia> a = b = c = 1.1..1.2
[1.09999, 1.20001]

julia> @btime fma($(Ref(a))[], $(Ref(b))[], $(Ref(c))[])
  228.507 ns (4 allocations: 448 bytes)
[2.30999, 2.64001]

so still faster than the original one but quite a loss in efficiency. I think the problem is that Inf - Inf = NaN and thus the help functions (like two_sum, two_fma ) can return a NaN for infinite input. I wonder whether these edge cases can be taken into account without sacrificing efficiency ๐Ÿค”

lucaferranti avatar Mar 28 '21 08:03 lucaferranti

Why not do something similar to

function FMA(a::T, b::T, c::T) where T
   fma_hi = fma(a, b, c)
   isnan(fma_hi) && return (T(NaN), T(NaN))
   isinf(fma_hi) && return (fma_hi, fma_hi)
   ....
end

When I have used this pattern in the past, I wrote a fast binary-based isnotfinite(fma_hi) and only checked isnan(fma_hi), isinf(fma_hi) when it had been determined that isnotfinite(fma_hi) (which happens rarely, so having the one conditional is much better than always running through two conditionals). Julia's !isfinite(x) has improved since then. It is faster to test !isfinite than to test isfinite.

JeffreySarnoff avatar Mar 28 '21 09:03 JeffreySarnoff

Ok I think I figured out how to avoid filtering NaNs. Here is a running example to explain what I mean. The problem is that if we want to compute e.g. fma([1, 2], [-โˆž, 3], [2, โˆž]) then the four candidates for the upper bound are

h1 = fma(1, -โˆž, โˆž) = NaN
h2 = fma(2, -โˆž, โˆž) = NaN
h3 = fma(1, 3, โˆž) = โˆž
h4 = fma(2, 3, โˆž) = โˆž

and the first two poison the computation of the maximum and hence the need for max/min_ignore_nans. Followinf Jeffrey's suggestions I modified the function hi, lo = interval_fma(a, b, c), which returns the rounded up (hi) and rounded down (lo) versions of fma as follows:

function interval_fma(a::T, b::T, c::T) where {T}

    hi = fma(a, b, c)
    isnan(hi) && return convert(T, -Inf), convert(T, Inf) # first is hi, second is lo
    !isfinite(hi) && return hi, hi

    hi, lo = two_fma(a, b, c)
    # and then like before

i.e. if the result of the scalar fma is NaN, it returns -Inf as upper bound and Inf as lower bound. This way those are automatically filtered out when taking the maximum of upper bounds and the minimum of lower bounds, because the only case in which min can return Inf is if all its inputs are Inf, which would mean that all 4 candidates would have returned NaN, which can happen only if some of the original input intervals were NaI or empty intervals or entire intervals, but these cases are already checked at the beginning of fma(a::Interval{T}, b::Interval{T}, c::Interval{T}).

Summa summarum, now all tests pass (the ITF1788 test suite has at least 108 tests checking those half unbounded intervals cases, so I guess they are quite exhaustive), no need for filtering NaN out and the benchmark is

julia> a = b = c = 1.1..1.2
[1.09999, 1.20001]

julia> @btime fma($(Ref(a))[], $(Ref(b))[], $(Ref(c))[])
  46.255 ns (0 allocations: 0 bytes)
[2.30999, 2.64001]

as a final small side note, at the beginning of fma(a::Interval{T}, b::Interval{T}, c::Interval{T}) I changed

isnan(a+b+c) && return a + b + c

to

(isnan(a) || isnan(b) || isnan(c)) && return a + b + c

squeezing a little more efficiency, now the benchmark is

julia> @btime fma($(Ref(a))[], $(Ref(b))[], $(Ref(c))[])
  35.549 ns (0 allocations: 0 bytes)
[2.30999, 2.64001]

I will soon polish the code and push it to the PR of this issue ( #443 ), comments, feedback and improvement suggestions are always welcome of course.

lucaferranti avatar Mar 28 '21 11:03 lucaferranti

With the benchmarking I did this morning

(isnan(a) || isnan(b) || isnan(c)) && return a + b + c

is sometimes less performant than each of these

isnan(a+b+c) && return a+b+c
isnan(fma(a,b,c)) && return fma(a,b,c)

(it goes so quickly that when none are NaN, the one-off timings are indistinguishable) Where code use brings issues of overlap and latency to the fore, the use of || is not as performant because

I displayed @code_native for each of the three, and then removed the instructions they all shared The version with '||' is the only one that has an internal jump (better avoided on general principles). Are your benchmarks using real-world code or just comparing these routines? If the latter, I am less persuaded. If the former, disregard this note.

JeffreySarnoff avatar Mar 28 '21 12:03 JeffreySarnoff