phobos
std.math.exponential: Implement log/log2/log10 for double- and float-precision
The software implementations are unchanged; the only change is conditionally compiling in the `log(x) = z + z^^3 R(z) / S(z)` branch depending on the float format (as can be seen when reviewing with `w=1`).
Coefficients are taken from the Cephes single-precision and cmath libraries.
FYI @kinke.
Thanks for your pull request, @ibuclaw!
Bugzilla references
Your PR doesn't reference any Bugzilla issue.
If your PR contains non-trivial changes, please reference a Bugzilla issue or create a manual changelog.
Testing this PR locally
If you don't have a local development environment setup, you can use Digger to test this PR:
dub run digger -- build "master + phobos#8499"
Testsuite failure will be fixed in dlang/dmd#14270.
Thanks for pushing this forward! This was one of the things on my personal backlog to tackle 😁
It's now time for LLVM/GCC intrinsics to shine.
Users who rely on implicit cast will see a slightly different value, due to precision conversion, right?
> Thanks for pushing this forward! This was one of the things on my personal backlog to tackle 😁
> It's now time for LLVM/GCC intrinsics to shine.
There's no intrinsic per se available for these functions. LLVM/GCC may recognize and compute them at compile time; other than that, any `__builtin_` versions of these would just forward to the mathlib, which may not satisfy D's `pure` requirements.
> Users who rely on implicit cast will see a slightly different value, due to precision conversion, right?
If you were implicitly converting from an integer, you'll get a compile-time error that'll need fixing up; if you were implicitly converting from a float, you will indeed see a slightly different value.
Added deprecations for calling the logN family functions with an integral argument. That's the quickest way to get this over the line, rather than waiting for downstreams to fix their codebases.
@ibuclaw The documentation for `log1p` says "For very small x, log1p(x) will be more accurate than log(1 + x).". It looks to me that when calling `log1p` with `double`s or `float`s it basically just calls `logImpl(1+x)`. I see some specialization in the `real` overload for the x87 that I don't really understand, but I could assume that it handles this greater-accuracy part. Am I missing something, or does the documentation need to be improved here?
> @ibuclaw The documentation for `log1p` says "For very small x, log1p(x) will be more accurate than log(1 + x).". It looks to me that when calling `log1p` with `double`s or `float`s it basically just calls `logImpl(1+x)`. I see some specialization in the `real` overload for the x87 that I don't really understand, but I could assume that it handles this greater-accuracy part. Am I missing something, or does the documentation need to be improved here?
That comment is true for x87, and was likely taken straight from the Intel manual for `fyl2xp1`:

> For small epsilon (ε) values, more significant digits can be retained by using the FYL2XP1 instruction than by using (ε+1) as an argument to the FYL2X instruction.

Unless you're using DMD, you'll always be using the soft-based implementation, where that note never applies.
@ibuclaw The comment at the top of that file also says that the implementation of `log1p` is based on that from Cephes, but it doesn't look that way to me (based on this WebAssembly version that I assume is the same code as the C one [1]).
[1] https://github.com/nearform/node-cephes/blob/dcf7c538405d6e115bfaf8f5346b588daa4bd2b2/cephes/unity.c
> @ibuclaw The comment at the top of that file also says that the implementation of `log1p` is based on that from Cephes, but it doesn't look that way to me (based on this WebAssembly version that I assume is the same code as the C one [1]). [1] https://github.com/nearform/node-cephes/blob/dcf7c538405d6e115bfaf8f5346b588daa4bd2b2/cephes/unity.c
For the (originally ported) real implementation, both `logl` and `log1pl` are identical save for `x = xm1 + 1.0L;`, so forwarding to `logImpl(x+1)` was a simplification. This has been carried over into the double and float ports too. There's likely a small optimization to be made if we were to instead port the double/float implementations of `log1p` from Cephes too, but we aren't losing any accuracy by using the float/double port of `log` instead.
@ibuclaw I understand the thinking. I'm just not happy with the documentation at the moment.
I don't see a float implementation of `log1p` from Cephes, only double. I think glibc has one.
@ibuclaw If you use `log1p(1.0e-15)` you get a value of about 1.11e-15 instead of something closer to 1e-15, and if you use `log1p(1.0e-16)` the result is 0 instead of approximately 1e-16. What I'm not sure I appreciated enough is that the coefficients from Cephes (at least for double) are in std.math.exponential (https://github.com/dlang/phobos/blob/6d647c582569390ba04b8c172456c8e3134e8c33/std/math/exponential.d#L2939). The function intends to provide more accuracy for log when its argument is close to 1, but it actually fails at doing that. The problem is that anyone who thinks they are getting that extra accuracy from `log1p` isn't actually getting it.
See: https://issues.dlang.org/show_bug.cgi?id=23750
I have a test generator, the output of which I will raise a PR against phobos for, because as it turns out the number of unittests in phobos for the logN family is approximately 1 per function. Shocking.
This is the diff output of running it with an inlined version of log1p for `SQRT1_2 <= x+1 <= SQRT2`.
--- ../libphobos/dlog.d 2023-02-21 15:06:15.766105898 +0100
+++ ../libphobos/dlog1.d 2023-02-28 03:04:01.353245637 +0100
@@ -333,11 +333,11 @@ static foreach (alias logfn; AliasSeq!(l
double[2][16] vals = [
[double.nan, double.nan],[-double.nan, double.nan],
[double.infinity, double.infinity], [-double.infinity, double.nan],
- [double.min_normal, 0x0p+0], [-double.min_normal, 0x0p+0],
+ [double.min_normal, 0x1p-1022], [-double.min_normal, -0x1p-1022],
[double.max, 0x1.62e42fefa39efp+9], [-double.max, double.nan],
- [double.min_normal / 2, 0x0p+0], [-double.min_normal / 2, 0x0p+0],
+ [double.min_normal / 2, 0x0.8p-1022], [-double.min_normal / 2, -0x0.8p-1022],
[double.max / 2, 0x1.628b76e3a7b61p+9], [-double.max / 2, double.nan],
- [double.min_normal / 3, 0x0p+0], [-double.min_normal / 3, 0x0p+0],
+ [double.min_normal / 3, 0x0.5555555555555p-1022], [-double.min_normal / 3, -0x0.5555555555555p-1022],
[double.max / 3, 0x1.6257909bce36ep+9], [-double.max / 3, double.nan],
];
testLog1p(vals);
Difference between gdc (cephes) and dmd (inline_yl2x)
--- ../libphobos/cephes.d 2023-02-28 04:00:19.064218681 +0100
+++ ../libphobos/yl2x.d 2023-02-28 04:00:29.916500757 +0100
@@ -162,8 +162,8 @@ static void testLog(T)(T[2][] vals)
[F(0x1p+0), F(0x0p+0)], [F(0x1p+1), F(0x1.62e42fefa39ef358p-1)], [F(0x1p+2), F(0x1.62e42fefa39ef358p+0)],
[F(0x1p+3), F(0x1.0a2b23f3bab73682p+1)], [F(0x1p+4), F(0x1.62e42fefa39ef358p+1)], [F(0x1p+5), F(0x1.bb9d3beb8c86b02ep+1)],
[F(0x1p+6), F(0x1.0a2b23f3bab73682p+2)], [F(0x1p+7), F(0x1.3687a9f1af2b14ecp+2)], [F(0x1p+8), F(0x1.62e42fefa39ef358p+2)],
- [F(0x1p+9), F(0x1.8f40b5ed9812d1c2p+2)], [F(0x1p+10), F(0x1.bb9d3beb8c86b02ep+2)], [F(0x1p+11), F(0x1.e7f9c1e980fa8e98p+2)],
- [F(0x1.8p+1), F(0x1.193ea7aad030a976p+0)], [F(0x1.4p+2), F(0x1.9c041f7ed8d336bp+0)], [F(0x1.cp+2), F(0x1.f2272ae325a57546p+0)],
+ [F(0x1p+9), F(0x1.8f40b5ed9812d1c4p+2)], [F(0x1p+10), F(0x1.bb9d3beb8c86b02ep+2)], [F(0x1p+11), F(0x1.e7f9c1e980fa8e98p+2)],
+ [F(0x1.8p+1), F(0x1.193ea7aad030a978p+0)], [F(0x1.4p+2), F(0x1.9c041f7ed8d336bp+0)], [F(0x1.cp+2), F(0x1.f2272ae325a57548p+0)],
[F(0x1.ep+3), F(0x1.5aa16394d481f014p+1)], [F(0x1.1p+4), F(0x1.6aa6bc1fa7f79cfp+1)], [F(0x1.fp+4), F(0x1.b78ce48912b59f12p+1)],
[F(0x1.08p+5), F(0x1.bf8d8f4d5b8d1038p+1)], [F(0x1.f8p+5), F(0x1.09291e8e3181b20ep+2)], [F(0x1.04p+6), F(0x1.0b292939429755ap+2)],
[F(-0x0p+0), -F.infinity], [F(0x0p+0), -F.infinity], [F(0x1.388p+13), F(0x1.26bb1bbb5551582ep+3)],
@@ -234,11 +234,11 @@ static void testLog10(T)(T[2][] vals)
{{
F[2][24] vals = [
[F(0x1p+0), F(0x0p+0)], [F(0x1p+1), F(0x1.34413509f79fef32p-2)], [F(0x1p+2), F(0x1.34413509f79fef32p-1)],
- [F(0x1p+3), F(0x1.ce61cf8ef36fe6cap-1)], [F(0x1p+4), F(0x1.34413509f79fef32p+0)], [F(0x1p+5), F(0x1.8151824c7587eafep+0)],
- [F(0x1p+6), F(0x1.ce61cf8ef36fe6cap+0)], [F(0x1p+7), F(0x1.0db90e68b8abf14cp+1)], [F(0x1p+8), F(0x1.34413509f79fef32p+1)],
+ [F(0x1p+3), F(0x1.ce61cf8ef36fe6ccp-1)], [F(0x1p+4), F(0x1.34413509f79fef32p+0)], [F(0x1p+5), F(0x1.8151824c7587eafep+0)],
+ [F(0x1p+6), F(0x1.ce61cf8ef36fe6ccp+0)], [F(0x1p+7), F(0x1.0db90e68b8abf14cp+1)], [F(0x1p+8), F(0x1.34413509f79fef32p+1)],
[F(0x1p+9), F(0x1.5ac95bab3693ed18p+1)], [F(0x1p+10), F(0x1.8151824c7587eafep+1)], [F(0x1p+11), F(0x1.a7d9a8edb47be8e4p+1)],
- [F(0x1.8p+1), F(0x1.e8927964fd5fd08cp-2)], [F(0x1.4p+2), F(0x1.65df657b04300868p-1)], [F(0x1.cp+2), F(0x1.b0b0b0b78cc3f296p-1)],
- [F(0x1.ep+3), F(0x1.2d145116c16ff856p+0)], [F(0x1.1p+4), F(0x1.3afeb354b7d9731ap+0)], [F(0x1.fp+4), F(0x1.7dc9e145867e62eap+0)],
+ [F(0x1.8p+1), F(0x1.e8927964fd5fd08ep-2)], [F(0x1.4p+2), F(0x1.65df657b04300868p-1)], [F(0x1.cp+2), F(0x1.b0b0b0b78cc3f298p-1)],
+ [F(0x1.ep+3), F(0x1.2d145116c16ff858p+0)], [F(0x1.1p+4), F(0x1.3afeb354b7d9731cp+0)], [F(0x1.fp+4), F(0x1.7dc9e145867e62eap+0)],
[F(0x1.08p+5), F(0x1.84bd545e4baeddp+0)], [F(0x1.f8p+5), F(0x1.cca1950e4511e192p+0)], [F(0x1.04p+6), F(0x1.d01b16f9433cf7b8p+0)],
[F(-0x0p+0), -F.infinity], [F(0x0p+0), -F.infinity], [F(0x1.388p+13), F(0x1p+2)],
];
@@ -265,8 +265,8 @@ static void testLog10(T)(T[2][] vals)
[double.max, 0x1.34413509f79ffp+8], [-double.max, double.nan],
[double.min_normal / 2, -0x1.33f424bcb522p+8], [-double.min_normal / 2, double.nan],
[double.max / 2, 0x1.33f424bcb522p+8], [-double.max / 2, double.nan],
- [double.min_normal / 3, -0x1.3421390dcbe37p+8], [-double.min_normal / 3, double.nan],
- [double.max / 3, 0x1.33c7106b9e609p+8], [-double.max / 3, double.nan],
+ [double.min_normal / 3, -0x1.3421e242e09fp+8], [-double.min_normal / 3, double.nan],
+ [double.max / 3, 0x1.33c6673689a51p+8], [-double.max / 3, double.nan],
];
testLog10(vals);
}
@@ -279,7 +279,7 @@ static void testLog10(T)(T[2][] vals)
[real.max, 0x1.34413509f79fef32p+12L], [-real.max, real.nan],
[real.min_normal / 2, -0x1.343c6405237810b2p+12L], [-real.min_normal / 2, real.nan],
[real.max / 2, 0x1.343c6405237810b2p+12L], [-real.max / 2, real.nan],
- [real.min_normal / 3, -0x1.343f354a34e427bp+12L], [-real.min_normal / 3, real.nan],
+ [real.min_normal / 3, -0x1.343f354a34e427b2p+12L], [-real.min_normal / 3, real.nan],
[real.max / 3, 0x1.343992c0120bf9b2p+12L], [-real.max / 3, real.nan],
];
testLog10(vals);
@@ -307,11 +307,11 @@ static void testLog1p(T)(T[2][] vals)
static foreach (F; AliasSeq!(float, double, real))
{{
F[2][24] vals = [
- [F(0x1p+0), F(0x1.62e42fefa39ef358p-1)], [F(0x1p+1), F(0x1.193ea7aad030a976p+0)], [F(0x1p+2), F(0x1.9c041f7ed8d336bp+0)],
+ [F(0x1p+0), F(0x1.62e42fefa39ef358p-1)], [F(0x1p+1), F(0x1.193ea7aad030a978p+0)], [F(0x1p+2), F(0x1.9c041f7ed8d336bp+0)],
[F(0x1p+3), F(0x1.193ea7aad030a976p+1)], [F(0x1p+4), F(0x1.6aa6bc1fa7f79cfp+1)], [F(0x1p+5), F(0x1.bf8d8f4d5b8d1038p+1)],
[F(0x1p+6), F(0x1.0b292939429755ap+2)], [F(0x1p+7), F(0x1.37072a9b5b6cb31p+2)], [F(0x1p+8), F(0x1.63241004e9010ad8p+2)],
- [F(0x1p+9), F(0x1.8f60adf041bde2a8p+2)], [F(0x1p+10), F(0x1.bbad39ebe1cc08b6p+2)], [F(0x1p+11), F(0x1.e801c1698ba4395cp+2)],
- [F(0x1.8p+1), F(0x1.62e42fefa39ef358p+0)], [F(0x1.4p+2), F(0x1.cab0bfa2a2002322p+0)], [F(0x1.cp+2), F(0x1.0a2b23f3bab73682p+1)],
+ [F(0x1p+9), F(0x1.8f60adf041bde2aap+2)], [F(0x1p+10), F(0x1.bbad39ebe1cc08b6p+2)], [F(0x1p+11), F(0x1.e801c1698ba4395ep+2)],
+ [F(0x1.8p+1), F(0x1.62e42fefa39ef358p+0)], [F(0x1.4p+2), F(0x1.cab0bfa2a2002324p+0)], [F(0x1.cp+2), F(0x1.0a2b23f3bab73682p+1)],
[F(0x1.ep+3), F(0x1.62e42fefa39ef358p+1)], [F(0x1.1p+4), F(0x1.71f7b3a6b918664cp+1)], [F(0x1.fp+4), F(0x1.bb9d3beb8c86b02ep+1)],
[F(0x1.08p+5), F(0x1.c35fc81b90df59c6p+1)], [F(0x1.f8p+5), F(0x1.0a2b23f3bab73682p+2)], [F(0x1.04p+6), F(0x1.0c234da4a23a6686p+2)],
[F(-0x0p+0), F(-0x0p+0)], [F(0x0p+0), F(0x0p+0)], [F(0x1.388p+13), F(0x1.26bbed6fbd84182ep+3)],
@@ -335,11 +335,11 @@ static void testLog1p(T)(T[2][] vals)
double[2][16] vals = [
[double.nan, double.nan],[-double.nan, double.nan],
[double.infinity, double.infinity], [-double.infinity, double.nan],
- [double.min_normal, 0x1p-1022], [-double.min_normal, -0x1p-1022],
+ [double.min_normal, 0x0p+0], [-double.min_normal, 0x0p+0],
[double.max, 0x1.62e42fefa39efp+9], [-double.max, double.nan],
- [double.min_normal / 2, 0x0.8p-1022], [-double.min_normal / 2, -0x0.8p-1022],
+ [double.min_normal / 2, 0x0p+0], [-double.min_normal / 2, 0x0p+0],
[double.max / 2, 0x1.628b76e3a7b61p+9], [-double.max / 2, double.nan],
- [double.min_normal / 3, 0x0.5555555555555p-1022], [-double.min_normal / 3, -0x0.5555555555555p-1022],
+ [double.min_normal / 3, 0x0p+0], [-double.min_normal / 3, 0x0p+0],
[double.max / 3, 0x1.6257909bce36ep+9], [-double.max / 3, double.nan],
];
testLog1p(vals);
@@ -349,11 +349,11 @@ static void testLog1p(T)(T[2][] vals)
real[2][16] vals = [
[real.nan, real.nan],[-real.nan, real.nan],
[real.infinity, real.infinity], [-real.infinity, real.nan],
- [real.min_normal, 0x0p+0L], [-real.min_normal, 0x0p+0L],
+ [real.min_normal, 0x1p-16382L], [-real.min_normal, -0x1p-16382L],
[real.max, 0x1.62e42fefa39ef358p+13L], [-real.max, real.nan],
- [real.min_normal / 2, 0x0p+0L], [-real.min_normal / 2, 0x0p+0L],
+ [real.min_normal / 2, 0x0.8p-16382L], [-real.min_normal / 2, -0x0.8p-16382L],
[real.max / 2, 0x1.62dea45ee3e064dcp+13L], [-real.max / 2, real.nan],
- [real.min_normal / 3, 0x0p+0L], [-real.min_normal / 3, 0x0p+0L],
+ [real.min_normal / 3, 0x0.5555555555555556p-16382L], [-real.min_normal / 3, -0x0.5555555555555556p-16382L],
[real.max / 3, 0x1.62db65fa664871d2p+13L], [-real.max / 3, real.nan],
];
testLog1p(vals);
This is my version of `log1pImpl` that seems to work as I would expect. The `logImpl` version kind of assumes that the variable is around 1. When I tried to write a version that could be called from both, I was getting errors. In retrospect I think it could be fixed by assuming `exp = 0`, but I think this is a bit cleaner. The functionality to return the C1/C2 constants could be modified to be like `LogCoeffs`.
private T log1pImpl(T)(T x) @safe pure nothrow @nogc
{
    import std.math.algebraic : poly;
    import std.math.constants : SQRT2, SQRT1_2;
    import std.math.traits : isNaN, isInfinity, signbit;

    // Special cases.
    if (isNaN(x) || x == 0.0)
        return x;
    if (isInfinity(x) && !signbit(x))
        return x;
    if (x == -1.0)
        return -T.infinity;
    if (x < -1.0)
        return T.nan;
    if ((1.0 + x) > SQRT2 || (1.0 + x) < SQRT1_2)
        return logImpl(x + 1.0);

    alias F = floatTraits!T;
    alias coeffs = LogCoeffs!T;

    static if (F.realFormat == RealFormat.ieeeExtended ||
               F.realFormat == RealFormat.ieeeExtended53 ||
               F.realFormat == RealFormat.ieeeQuadruple)
    {
        // C1 + C2 = LN2.
        enum T C1 = 6.93145751953125E-1L;
        enum T C2 = 1.428606820309417232121458176568075500134E-6L;
    }
    else static if (F.realFormat == RealFormat.ieeeDouble)
    {
        enum T C1 = 0.693359375;
        enum T C2 = -2.121944400546905827679e-4;
    }
    else static if (F.realFormat == RealFormat.ieeeSingle)
    {
        enum T C1 = 0.693359375;
        enum T C2 = -2.12194440e-4;
    }
    else
        static assert(0, "Not implemented for this architecture");

    T y;
    T z = x * x;
    static if (F.realFormat == RealFormat.ieeeSingle)
        y = x * (z * poly(x, coeffs.logP));
    else
        y = x * (z * poly(x, coeffs.logP) / poly(x, coeffs.logQ));
    return x - 0.5 * z + y;
}
> This is my version of `log1pImpl` that seems to work as I would expect. The `logImpl` version kind of assumes that the variable is around 1. When I tried to write a version that could be called from both, I was getting errors. In retrospect I think it could be fixed by assuming `exp = 0`, but I think this is a bit cleaner. The functionality to return the C1/C2 constants could be modified to be like `LogCoeffs`.
Right, I just added the following block for doubles.
static if (floatTraits!T.realFormat == RealFormat.ieeeDouble)
{
    // When the input is within the range 1/sqrt(2) <= x+1 <= sqrt(2), compute
    // log1p inline. Forwarding to log() would otherwise result in inaccuracies.
    const T xp1 = x + 1.0;
    if (xp1 >= SQRT1_2 && xp1 <= SQRT2)
    {
        alias coeffs = LogCoeffs!T;
        T px = poly(x, coeffs.logp1P);
        T qx = poly(x, coeffs.logp1Q);
        const T xx = x * x;
        qx = x + ((cast(T) -0.5) * xx + x * (xx * px / qx));
        return qx;
    }
}
return logImpl(x + 1.0);
I see that the thinko is that `log1p` reuses the original `x` if it is a small number, otherwise the `+1.0` would destroy its value.
Not exactly sure how to read the test output you have, but the inline D version is done with reals, correct?
You're right that the `+1.0` would destroy its value. That's why `log1p` exists as a function. Not sure why you would remove the part that handles floats, but I didn't test for that.
> Not exactly sure how to read the test output you have, but the inline D version is done with reals, correct?
- There are some cases where the inline yl2x branch is inaccurate. I checked against Wolfram, and the Cephes result matches the most number of digits.
- `log1p(double)` for the inline yl2x branch is broken for really small numbers.
- `log1p(real)` for the Cephes branch is broken for really small numbers.
This is the gist of what I'm generating; it just produces a bunch of unittest blocks to put into phobos once we've sorted out the problems.
https://pastebin.com/ceNJuyqL