replacing `expm1` implementation
expm1(x::Double64) loses all digits of accuracy below abs(x) < 1e-163 and returns 0.0 instead of x.
This regresses even beyond Float64, which correctly returns x.
using DoubleFloats
x = 1e-200
@assert x == expm1(x)
xx = Double64(x)
@assert xx == expm1(xx)
good catch
here is my quick fix ... it should help you while I review it
function Base.expm1(a::DoubleFloat{T}) where {T<:DoubleFloats.IEEEFloat}
isnan(a) && return a
isinf(a) && return(signbit(a) ? zero(DoubleFloat{T}) : a)
u = exp(a)
# temp fix of if (u == one(DoubleFloat{T}))
if (isone(u.hi))
a
elseif (u-1.0 == -one(DoubleFloat{T}))
-one(DoubleFloat{T})
else
a*(u-1.0) / log(u)
end
end
https://github.com/JuliaMath/DoubleFloats.jl/issues/135 has a much better fix.
#135 has a much better fix.
Shouldn't this remain open until #135 is merged?
yes
I do not see the described behavior
using DoubleFloats
x = 1e-200
dfx = Double64(x)
expm1(dfx)
# 1.0e-200
Double64(expm1(BigFloat(x)))
# 1.0e-200
I am closing this -- if there is something I missed, let me know
this is still an issue for expm1(.5)
???
using DoubleFloats, Quadmath
x= 0.5
x128 = Float128(x)
xdf = Double64(x)
ans128 = expm1(x128)
ansdf = expm1(xdf)
bestansdf = Double64(ans128)
bestansdf - ansdf == 0.0
Sorry, I chose the wrong number.
expm1(Double64(.0000000000000001))
1.0e-16
julia> expm1(big(Double64(.0000000000000001)))-ans
4.999999999999999957644533907012698185150443928822782614563950837394859595575927e-33
julia> eps(expm1(Double64(.0000000000000001)))
2.7369110631344083e-48
so in this case, DoubleFloats is off by 10e15 ULP
After spending some time with the implementation, it has become clear that the only fix is to replace expm1(::Double64) with something that works and runs within 2.5x-3.0x of the spotty expm1. I have done this.
expm1 now converts its argument to a Float128 (from Quadmath), lets Quadmath compute the function and reconverts the result to Double64..
This is much faster than using BigFloat, and it is well traveled.
I did generate factorial coeffs for Double64 in the event we want to run the series solution another time. 1! .. 36! will fit in a Double64 without error. I derived the approach and computed the values that lets us use 37! .. 52!, for these each factorial is separated into two values of approximately equal magnitude, values which when multiplied together yield the corresponding factorial. Of course direct multiplication of these values would by inexact.
Consider 42!
π»πΆπΈ(x) = factorial(BigInt(x))
factorial42 = π»πΆπΈ(42)
# 405006117752879898543142606244511569936384000000000
# this is too large to be held within a Double64 and be recovered
split_factorial(target, splitter) =
split_factorial(BigInt(target), BigInt(splitter))
function split_factorial(target::T, splitter::T) where {T::BigInt}
splitter_factorial_contribution = π»πΆπΈ(splitter)
target_factorial_contribution = pochhammer(splitter + 1, target - splitter)
(splitter_factorial_contribution, target_factorial_contribution)
end
# borrowed from https://github.com/mfpaulos/JuliBoots (2018)
function pochhammer(x::BigInt, n)
iszero(n) && return BigInt(1)
if n > 0
(x + n - 1) * pochhammer(x, n-1)
else
pochhammer(x, n+1) / (x+n)
end
end
# back to 42!
factorial_factors_42 = split_factorial(42, 25)
# (15511210043330985984000000, 90580045904088539774976000)
ndigits.(factorial_factors_42)
# (26, 26)
# at least within the range we are addressing, the most balanced splits
# either have the same digit counts or they differ by 1
prod(factorial_factors_42) == π»πΆπΈ(42)
# and the split works with Double64
a = Double64.(factorial_factors_42)
b = BigInt.(a)
b == factorial_factors_42
For the record, these are appropriate for use with Double64 smallerfactorial values fit in a Double64, largerfactorial values fit in two Double64s.
smallerfactorials = [ (1, 1) (2, 2) (3, 6) (4, 24) (5, 120) (6, 720) (7, 5040) (8, 40320) (9, 362880) (10, 3628800) (11, 39916800) (12, 479001600) (13, 6227020800) (14, 87178291200) (15, 1307674368000) (16, 20922789888000) (17, 355687428096000) (18, 6402373705728000) (19, 121645100408832000) (20, 2432902008176640000) (21, 51090942171709440000) (22, 1124000727777607680000) (23, 25852016738884976640000) (24, 620448401733239439360000) (25, 15511210043330985984000000) (26, 403291461126605635584000000) (27, 10888869450418352160768000000) (28, 304888344611713860501504000000) (29, 8841761993739701954543616000000) (30, 265252859812191058636308480000000) (31, 8222838654177922817725562880000000) (32, 263130836933693530167218012160000000) (33, 8683317618811886495518194401280000000) (34, 295232799039604140847618609643520000000) (35, 10333147966386144929666651337523200000000) (36, 371993326789901217467999448150835200000000) ]
largerfactorials = [ (37, (1124000727777607680000, 12245324002983751680000)), (38, (25852016738884976640000, 20231404874494894080000)), (39, (620448401733239439360000, 32876032921054202880000)), (40, (620448401733239439360000, 1315041316842168115200000)), (41, (15511210043330985984000000, 2156667759621155708928000)), (42, (15511210043330985984000000, 90580045904088539774976000)), (43, (403291461126605635584000000, 149805460533684892704768000)), (44, (403291461126605635584000000, 6591440263482135279009792000)), (45, (10888869450418352160768000000, 10985733772470225465016320000)), (46, (10888869450418352160768000000, 505343753533630371390750720000)), (47, (304888344611713860501504000000, 848255586288593837691617280000)), (48, (8841761993739701954543616000000, 1404009246270776007213711360000)), (49, (8841761993739701954543616000000, 68796453067268024353471856640000)), (50, (265252859812191058636308480000000, 114660755112113373922453094400000)), (51, (265252859812191058636308480000000, 5847698510717782070045107814400000)), (52, (8222838654177922817725562880000000, 9809042663139505407817600204800000)) ]
Double64 inv factorials are exactly recoverable through factorial(big(30))
smaller_invfactorials = [ (1, Double64(1.0, 0.0)) (2, Double64(0.5, 0.0)) (3, Double64(0.16666666666666666, 9.25185853854297e-18)) (4, Double64(0.041666666666666664, 2.3129646346357427e-18)) (5, Double64(0.008333333333333333, 1.1564823173178714e-19)) (6, Double64(0.001388888888888889, -5.300543954373577e-20)) (7, Double64(0.0001984126984126984, 1.7209558293420705e-22)) (8, Double64(2.48015873015873e-5, 2.1511947866775882e-23)) (9, Double64(2.7557319223985893e-6, -1.858393274046472e-22)) (10, Double64(2.755731922398589e-7, 2.3767714622250297e-23)) (11, Double64(2.505210838544172e-8, -1.448814070935912e-24)) (12, Double64(2.08767569878681e-9, -1.20734505911326e-25)) (13, Double64(1.6059043836821613e-10, 1.2585294588752098e-26)) (14, Double64(1.1470745597729725e-11, 2.0655512752830745e-28)) (15, Double64(7.647163731819816e-13, 7.03872877733453e-30)) (16, Double64(4.779477332387385e-14, 4.399205485834081e-31)) (17, Double64(2.8114572543455206e-15, 1.6508842730861433e-31)) (18, Double64(1.5619206968586225e-16, 1.1910679660273754e-32)) (19, Double64(8.22063524662433e-18, 2.2141894119604265e-34)) (20, Double64(4.110317623312165e-19, 1.4412973378659527e-36)) (21, Double64(1.9572941063391263e-20, -1.3643503830087908e-36)) (22, Double64(8.896791392450574e-22, -7.911402614872376e-38)) (23, Double64(3.868170170630684e-23, -8.843177655482344e-40)) (24, Double64(1.6117375710961184e-24, -3.6846573564509766e-41)) (25, Double64(6.446950284384474e-26, -1.9330404233703465e-42)) (26, Double64(2.4795962632247976e-27, -1.2953730964765229e-43)) (27, Double64(9.183689863795546e-29, 1.4303150396787322e-45)) (28, Double64(3.279889237069838e-30, 1.5117542744029879e-46)) (29, Double64(1.1309962886447716e-31, 1.0498015412959506e-47)) (30, Double64(3.7699876288159054e-33, 2.5870347832750324e-49))
these below are approximations, above are exactly recoverable
(31, Double64(1.216125041553518e-34, 5.586290567888806e-51)) (32, Double64(3.8003907548547434e-36, 1.7457158024652518e-52)) (33, Double64(1.151633562077195e-37, -6.09957445788454e-54)) (34, Double64(3.387157535521162e-39, 5.09056148151085e-56)) (35, Double64(9.67759295863189e-41, 3.202295548645562e-57)) (36, Double64(2.6882202662866363e-42, 5.355061165943334e-59)) ];
invfactorials = [ Double64(0x1p+0, 0x0p+0), Double64(0x1p-1, 0x0p+0), Double64(0x1.5555555555555p-3, 0x1.5555555555555p-57), Double64(0x1.5555555555555p-5, 0x1.5555555555555p-59), Double64(0x1.1111111111111p-7, 0x1.1111111111111p-63), Double64(0x1.6c16c16c16c17p-10, -0x1.f49f49f49f49fp-65), Double64(0x1.a01a01a01a01ap-13, 0x1.a01a01a01a01ap-73), Double64(0x1.a01a01a01a01ap-16, 0x1.a01a01a01a01ap-76), Double64(0x1.71de3a556c734p-19, -0x1.c154f8ddc6cp-73), Double64(0x1.27e4fb7789f5cp-22, 0x1.cbbc05b4fa99ap-76), Double64(0x1.ae64567f544e4p-26, -0x1.c062e06d1f209p-80), Double64(0x1.1eed8eff8d898p-29, -0x1.2aec959e14c06p-83), Double64(0x1.6124613a86d09p-33, 0x1.f28e0cc748ebep-87), Double64(0x1.93974a8c07c9dp-37, 0x1.05d6f8a2efd1fp-92), Double64(0x1.ae7f3e733b81fp-41, 0x1.1d8656b0ee8cbp-97), Double64(0x1.ae7f3e733b81fp-45, 0x1.1d8656b0ee8cbp-101), Double64(0x1.952c77030ad4ap-49, 0x1.ac981465ddc6cp-103), Double64(0x1.6827863b97d97p-53, 0x1.eec01221a8b0bp-107), Double64(0x1.2f49b46814157p-57, 0x1.2650f61dbdcb4p-112), Double64(0x1.e542ba4020225p-62, 0x1.ea72b4afe3c2fp-120), Double64(0x1.71b8ef6dcf572p-66, -0x1.d043ae40c4647p-120), Double64(0x1.0ce396db7f853p-70, -0x1.aebcdbd20331cp-124), Double64(0x1.761b41316381ap-75, -0x1.3423c7d91404fp-130), Double64(0x1.f2cf01972f578p-80, -0x1.9ada5fcc1ab14p-135), Double64(0x1.3f3ccdd165fa9p-84, -0x1.58ddadf344487p-139), Double64(0x1.88e85fc6a4e5ap-89, -0x1.71c37ebd1654p-143), Double64(0x1.d1ab1c2dccea3p-94, 0x1.054d0c78aea14p-149), Double64(0x1.0a18a2635085dp-98, 0x1.b9e2e28e1aa54p-153), Double64(0x1.259f98b4358adp-103, 0x1.eaf8c39dd9bc5p-157), Double64(0x1.3932c5047d60ep-108, 0x1.832b7b530a627p-162) ];
I implemented something here which seems to work: https://github.com/tuwien-cms/xprec/blob/v1.2.13/csrc/dd_arith.c#L162
Might be worth taking a look at ....
That looks like a pretty good version, although it won't be the most accurate since it ends in 9 multiplications that are uncompensated, so I think the error will be around 10 ULP error.
@oscardssmith I took the exp from the QD library, so that problem is "imported" from there. For me, 10 ulps are acceptable, but of course you guys may feel differently. But I think at least the expm1 should not lose more than 1 ulp on top of that.
Anyway, is there any easy way to fix that uncompensated multiplication problem?
You could compensate the multiplication, but it would likely be a noticeable slowdown. 10 ULP may just be the price here. (The other option would be a table-based method)
Here's a C++ implementation of expm1 and exp that is accurate to 2 ulps across the range and takes around 400 flops:
https://github.com/tuwien-cms/libxprec/blob/v0.4.2/src/exp.cxx
The idea is similar to what you are doing already ... but we simply split off the one in the expm1_kernel and add or do not add it depending on whether we need exp or expm1. Might be relatively straight-forward to port if you're so inclined.