DoubleFloats.jl replacing `expm1` implementation

expm1(x::Double64) loses all digits of accuracy below abs(x) < 1e-163 and returns 0.0 instead of x.

This regresses even beyond Float64, which correctly returns x.

using DoubleFloats
x = 1e-200
@assert x == expm1(x)
xx = Double64(x)
@assert xx == expm1(xx)

Sep 01 '21 15:09 mwallerb

good catch

Sep 02 '21 01:09 JeffreySarnoff

here is my quick fix ... it should help you while I review it

function Base.expm1(a::DoubleFloat{T}) where {T<:DoubleFloats.IEEEFloat}
    isnan(a) && return a
    isinf(a) && return(signbit(a) ? zero(DoubleFloat{T}) : a)
    u = exp(a)
    # temp fix of if (u == one(DoubleFloat{T}))
    if (isone(u.hi))
        a
    elseif (u-1.0 == -one(DoubleFloat{T}))
        -one(DoubleFloat{T})
    else
        a*(u-1.0) / log(u)
    end
end

Sep 02 '21 01:09 JeffreySarnoff

https://github.com/JuliaMath/DoubleFloats.jl/issues/135 has a much better fix.

Jan 22 '22 17:01 oscardssmith

#135 has a much better fix.

Jan 31 '22 20:01 JeffreySarnoff

Shouldn't this remain open until #135 is merged?

Jan 31 '22 20:01 oscardssmith

yes

Jan 31 '22 20:01 JeffreySarnoff

I do not see the described behavior

using DoubleFloats

x = 1e-200
dfx = Double64(x)
expm1(dfx) 
# 1.0e-200
Double64(expm1(BigFloat(x)))
# 1.0e-200

I am closing this -- if there is something I missed, let me know

May 03 '22 08:05 JeffreySarnoff

this is still an issue for expm1(.5)

May 03 '22 11:05 oscardssmith

???

using DoubleFloats, Quadmath
x= 0.5
x128 = Float128(x)
xdf = Double64(x)
ans128 = expm1(x128)
ansdf = expm1(xdf)
bestansdf = Double64(ans128)

bestansdf - ansdf == 0.0

May 03 '22 14:05 JeffreySarnoff

Sorry, I chose the wrong number.

expm1(Double64(.0000000000000001))
1.0e-16
julia> expm1(big(Double64(.0000000000000001)))-ans
4.999999999999999957644533907012698185150443928822782614563950837394859595575927e-33
julia> eps(expm1(Double64(.0000000000000001)))
2.7369110631344083e-48

so in this case, DoubleFloats is off by 10e15 ULP

May 03 '22 15:05 oscardssmith

After spending some time with the implementation, it has become clear that the only fix is to replace expm1(::Double64) with something that works and runs within 2.5x-3.0x of the spotty expm1. I have done this. expm1 now converts its argument to a Float128 (from Quadmath), lets Quadmath compute the function and reconverts the result to Double64.. This is much faster than using BigFloat, and it is well traveled.

I did generate factorial coeffs for Double64 in the event we want to run the series solution another time. 1! .. 36! will fit in a Double64 without error. I derived the approach and computed the values that lets us use 37! .. 52!, for these each factorial is separated into two values of approximately equal magnitude, values which when multiplied together yield the corresponding factorial. Of course direct multiplication of these values would by inexact.

Consider 42!

𝒻𝒶𝒸(x) = factorial(BigInt(x))

factorial42 =  𝒻𝒶𝒸(42)
# 405006117752879898543142606244511569936384000000000
# this is too large to be held within a Double64 and be recovered

split_factorial(target, splitter) = 
    split_factorial(BigInt(target), BigInt(splitter))

function split_factorial(target::T, splitter::T) where {T::BigInt}
    splitter_factorial_contribution =  𝒻𝒶𝒸(splitter)
    target_factorial_contribution = pochhammer(splitter + 1, target - splitter)
    (splitter_factorial_contribution, target_factorial_contribution)
end

# borrowed from https://github.com/mfpaulos/JuliBoots (2018)
function pochhammer(x::BigInt, n)
    iszero(n) && return BigInt(1)
    if n > 0
         (x + n - 1) * pochhammer(x, n-1)
    else
       pochhammer(x, n+1) / (x+n)
    end
end

# back to 42!

factorial_factors_42 = split_factorial(42, 25)
# (15511210043330985984000000, 90580045904088539774976000)
ndigits.(factorial_factors_42)
# (26, 26)    
# at least within the range we are addressing, the most balanced splits
#     either have the same digit counts or they differ by 1
prod(factorial_factors_42) == 𝒻𝒶𝒸(42)

# and the split works with Double64
a = Double64.(factorial_factors_42)
b = BigInt.(a)
b == factorial_factors_42

For the record, these are appropriate for use with Double64 smallerfactorial values fit in a Double64, largerfactorial values fit in two Double64s.

smallerfactorials = [ (1, 1) (2, 2) (3, 6) (4, 24) (5, 120) (6, 720) (7, 5040) (8, 40320) (9, 362880) (10, 3628800) (11, 39916800) (12, 479001600) (13, 6227020800) (14, 87178291200) (15, 1307674368000) (16, 20922789888000) (17, 355687428096000) (18, 6402373705728000) (19, 121645100408832000) (20, 2432902008176640000) (21, 51090942171709440000) (22, 1124000727777607680000) (23, 25852016738884976640000) (24, 620448401733239439360000) (25, 15511210043330985984000000) (26, 403291461126605635584000000) (27, 10888869450418352160768000000) (28, 304888344611713860501504000000) (29, 8841761993739701954543616000000) (30, 265252859812191058636308480000000) (31, 8222838654177922817725562880000000) (32, 263130836933693530167218012160000000) (33, 8683317618811886495518194401280000000) (34, 295232799039604140847618609643520000000) (35, 10333147966386144929666651337523200000000) (36, 371993326789901217467999448150835200000000) ]

largerfactorials = [ (37, (1124000727777607680000, 12245324002983751680000)), (38, (25852016738884976640000, 20231404874494894080000)), (39, (620448401733239439360000, 32876032921054202880000)), (40, (620448401733239439360000, 1315041316842168115200000)), (41, (15511210043330985984000000, 2156667759621155708928000)), (42, (15511210043330985984000000, 90580045904088539774976000)), (43, (403291461126605635584000000, 149805460533684892704768000)), (44, (403291461126605635584000000, 6591440263482135279009792000)), (45, (10888869450418352160768000000, 10985733772470225465016320000)), (46, (10888869450418352160768000000, 505343753533630371390750720000)), (47, (304888344611713860501504000000, 848255586288593837691617280000)), (48, (8841761993739701954543616000000, 1404009246270776007213711360000)), (49, (8841761993739701954543616000000, 68796453067268024353471856640000)), (50, (265252859812191058636308480000000, 114660755112113373922453094400000)), (51, (265252859812191058636308480000000, 5847698510717782070045107814400000)), (52, (8222838654177922817725562880000000, 9809042663139505407817600204800000)) ]

Double64 inv factorials are exactly recoverable through factorial(big(30))

smaller_invfactorials = [ (1, Double64(1.0, 0.0)) (2, Double64(0.5, 0.0)) (3, Double64(0.16666666666666666, 9.25185853854297e-18)) (4, Double64(0.041666666666666664, 2.3129646346357427e-18)) (5, Double64(0.008333333333333333, 1.1564823173178714e-19)) (6, Double64(0.001388888888888889, -5.300543954373577e-20)) (7, Double64(0.0001984126984126984, 1.7209558293420705e-22)) (8, Double64(2.48015873015873e-5, 2.1511947866775882e-23)) (9, Double64(2.7557319223985893e-6, -1.858393274046472e-22)) (10, Double64(2.755731922398589e-7, 2.3767714622250297e-23)) (11, Double64(2.505210838544172e-8, -1.448814070935912e-24)) (12, Double64(2.08767569878681e-9, -1.20734505911326e-25)) (13, Double64(1.6059043836821613e-10, 1.2585294588752098e-26)) (14, Double64(1.1470745597729725e-11, 2.0655512752830745e-28)) (15, Double64(7.647163731819816e-13, 7.03872877733453e-30)) (16, Double64(4.779477332387385e-14, 4.399205485834081e-31)) (17, Double64(2.8114572543455206e-15, 1.6508842730861433e-31)) (18, Double64(1.5619206968586225e-16, 1.1910679660273754e-32)) (19, Double64(8.22063524662433e-18, 2.2141894119604265e-34)) (20, Double64(4.110317623312165e-19, 1.4412973378659527e-36)) (21, Double64(1.9572941063391263e-20, -1.3643503830087908e-36)) (22, Double64(8.896791392450574e-22, -7.911402614872376e-38)) (23, Double64(3.868170170630684e-23, -8.843177655482344e-40)) (24, Double64(1.6117375710961184e-24, -3.6846573564509766e-41)) (25, Double64(6.446950284384474e-26, -1.9330404233703465e-42)) (26, Double64(2.4795962632247976e-27, -1.2953730964765229e-43)) (27, Double64(9.183689863795546e-29, 1.4303150396787322e-45)) (28, Double64(3.279889237069838e-30, 1.5117542744029879e-46)) (29, Double64(1.1309962886447716e-31, 1.0498015412959506e-47)) (30, Double64(3.7699876288159054e-33, 2.5870347832750324e-49))

these below are approximations, above are exactly recoverable

(31, Double64(1.216125041553518e-34, 5.586290567888806e-51)) (32, Double64(3.8003907548547434e-36, 1.7457158024652518e-52)) (33, Double64(1.151633562077195e-37, -6.09957445788454e-54)) (34, Double64(3.387157535521162e-39, 5.09056148151085e-56)) (35, Double64(9.67759295863189e-41, 3.202295548645562e-57)) (36, Double64(2.6882202662866363e-42, 5.355061165943334e-59)) ];

invfactorials = [ Double64(0x1p+0, 0x0p+0), Double64(0x1p-1, 0x0p+0), Double64(0x1.5555555555555p-3, 0x1.5555555555555p-57), Double64(0x1.5555555555555p-5, 0x1.5555555555555p-59), Double64(0x1.1111111111111p-7, 0x1.1111111111111p-63), Double64(0x1.6c16c16c16c17p-10, -0x1.f49f49f49f49fp-65), Double64(0x1.a01a01a01a01ap-13, 0x1.a01a01a01a01ap-73), Double64(0x1.a01a01a01a01ap-16, 0x1.a01a01a01a01ap-76), Double64(0x1.71de3a556c734p-19, -0x1.c154f8ddc6cp-73), Double64(0x1.27e4fb7789f5cp-22, 0x1.cbbc05b4fa99ap-76), Double64(0x1.ae64567f544e4p-26, -0x1.c062e06d1f209p-80), Double64(0x1.1eed8eff8d898p-29, -0x1.2aec959e14c06p-83), Double64(0x1.6124613a86d09p-33, 0x1.f28e0cc748ebep-87), Double64(0x1.93974a8c07c9dp-37, 0x1.05d6f8a2efd1fp-92), Double64(0x1.ae7f3e733b81fp-41, 0x1.1d8656b0ee8cbp-97), Double64(0x1.ae7f3e733b81fp-45, 0x1.1d8656b0ee8cbp-101), Double64(0x1.952c77030ad4ap-49, 0x1.ac981465ddc6cp-103), Double64(0x1.6827863b97d97p-53, 0x1.eec01221a8b0bp-107), Double64(0x1.2f49b46814157p-57, 0x1.2650f61dbdcb4p-112), Double64(0x1.e542ba4020225p-62, 0x1.ea72b4afe3c2fp-120), Double64(0x1.71b8ef6dcf572p-66, -0x1.d043ae40c4647p-120), Double64(0x1.0ce396db7f853p-70, -0x1.aebcdbd20331cp-124), Double64(0x1.761b41316381ap-75, -0x1.3423c7d91404fp-130), Double64(0x1.f2cf01972f578p-80, -0x1.9ada5fcc1ab14p-135), Double64(0x1.3f3ccdd165fa9p-84, -0x1.58ddadf344487p-139), Double64(0x1.88e85fc6a4e5ap-89, -0x1.71c37ebd1654p-143), Double64(0x1.d1ab1c2dccea3p-94, 0x1.054d0c78aea14p-149), Double64(0x1.0a18a2635085dp-98, 0x1.b9e2e28e1aa54p-153), Double64(0x1.259f98b4358adp-103, 0x1.eaf8c39dd9bc5p-157), Double64(0x1.3932c5047d60ep-108, 0x1.832b7b530a627p-162) ];

May 04 '22 06:05 JeffreySarnoff

I implemented something here which seems to work: https://github.com/tuwien-cms/xprec/blob/v1.2.13/csrc/dd_arith.c#L162

Might be worth taking a look at ....

May 09 '22 21:05 mwallerb

That looks like a pretty good version, although it won't be the most accurate since it ends in 9 multiplications that are uncompensated, so I think the error will be around 10 ULP error.

May 09 '22 21:05 oscardssmith

@oscardssmith I took the exp from the QD library, so that problem is "imported" from there. For me, 10 ulps are acceptable, but of course you guys may feel differently. But I think at least the expm1 should not lose more than 1 ulp on top of that.

Anyway, is there any easy way to fix that uncompensated multiplication problem?

May 09 '22 22:05 mwallerb

You could compensate the multiplication, but it would likely be a noticeable slowdown. 10 ULP may just be the price here. (The other option would be a table-based method)

May 09 '22 22:05 oscardssmith

Here's a C++ implementation of expm1 and exp that is accurate to 2 ulps across the range and takes around 400 flops:

https://github.com/tuwien-cms/libxprec/blob/v0.4.2/src/exp.cxx

The idea is similar to what you are doing already ... but we simply split off the one in the expm1_kernel and add or do not add it depending on whether we need exp or expm1. Might be relatively straight-forward to port if you're so inclined.

Apr 18 '24 09:04 mwallerb