openlibm
openlibm copied to clipboard
Accuracy of single precision functions
@zimmermann6 has a publication comparing the accuracy of single precision functions across several libms.
https://members.loria.fr/PZimmermann/papers/accuracy.pdf
The tests are with openlibm 0.4.1, and it would be nice to be able to try the latest release. In general musl fares better.
While the max ULP is an important metric, two additional metrics may be more important: speed and distribution of ULP (if max ULP > 1). Consider j0f(x). It is well-known that polynomial approximations break down near zeros of a function. Exhaustive testing in the domain [1:4], which contains 1 zero, shows
% tlibm -DfEsP -x 1 -X 4 j0
Interval tested for j0f: [1,4]
1000000 calls, 0.090046 secs, 0.09005 usecs/call
ulp
The reference function was j0(x). The zeros of the regular Bessel functions are simple zeros, so the polynomial approximation can take this into account. For example, j0f(x) = (z - z0) * P(x) with the zero z0 split into high and low parts with a total of 48 bits of precision. This gives
% ./tlibm -DfEsP -x 1 -X 4 j0
Interval tested for j0f: [1,4]
1000000 calls, 0.045775 secs, 0.04578 usecs/call
ulp
A much more accurate approximation that is twice as fast!
Unfortunately, Bessel functions are not periodic so argument reduction cannot be used. Thus, one needs multiple polynomial approximations to cover a significant domain. J. Harrison [1] proposed an algorithm that uses multiple polynomial for x < 45 and an asymptotic expansion about the zeros for x > 45. I haven't had time to pursue this.
[1] (https://www.cl.cam.ac.uk/~jrh13/papers/bessel.html)
Dear Steve, on my side I am more interested by the max ULP. I know other people are more interested by speed. I guess the Intel Math Library implements the algorithm from John Harrison.
@zimmermann6 has a publication comparing the accuracy of single precision functions across several libms.
https://members.loria.fr/PZimmermann/papers/accuracy.pdf
The tests are with openlibm 0.4.1, and it would be nice to be able to try the latest release. In general musl fares better.
I have updated my results with OpenLibm 0.7.0. The results are almost the same, except for y1f whose largest error has improved. The url is still the same. Previous versions are available from https://members.loria.fr/PZimmermann/papers/
I have revised my note with double and quadruple precision, and bivariate functions like atan2, hypot, pow:
https://homepages.loria.fr/PZimmermann/papers/#accuracy
For double precision, the largest error I found for OpenLibm 0.7.0 is less than 4 ulps, except for the Bessel functions, and the lgamma and tgamma functions.
If something has changed in newer versions of OpenLibm, please tell me and I will update my note.
a new version of my note is now available, including the "long double" format:
https://homepages.loria.fr/PZimmermann/papers/#accuracy
I found a few issues with Openlibm, which I reported in separate new issues here.
I wonder what the easiest way is to fix some of these. Maybe just pick the good implementations from other libm libraries and bring them in here?
I am about to release of new version of the accuracy comparison. It will be with Openlibm 0.7.5, the last release. For single precision, the large error for powf has been fixed. It remains large errors for the Bessel functions (j0f, j1f, y0f, y1f) and lgammaf. All other functions have a maximal error less than 3.17 ulps.
I said above that in OpenLibm 0.7.5, for single precision, the large error for powf has been fixed. But I have found a new pair of inputs giving a huge error: for x=0x1.ffffecp-1 and y=-0x1.000002p+27 OpenLibm 0.7.5 yields +Inf, whereas the correct result is 0x1.557a86p115 (rounding to nearest).
This patch fixes the issue reported by Paul Zimmerman.
--- /usr/src/lib/msun/src/e_powf.c 2021-02-21 03:29:00.956878000 -0800
+++ src/e_powf.c 2021-09-06 08:17:09.800008000 -0700
@@ -136,7 +136,7 @@
/* |y| is huge */
if(iy>0x4d000000) { /* if |y| > 2**27 */
/* over/underflow if x is not close to one */
- if(ix<0x3f7ffff7) return (hy<0)? sn*huge*huge:sn*tiny*tiny;
+ if(ix<0x3f7ffff6) return (hy<0)? sn*huge*huge:sn*tiny*tiny;
if(ix>0x3f800007) return (hy>0)? sn*huge*huge:sn*tiny*tiny;
/* now |1-x| is tiny <= 2**-20, suffice to compute
log(x) by x-x^2/2+x^3/3-x^4/4 */
Fix in https://github.com/JuliaMath/openlibm/commit/a090d3e614bdca67d05c52cb346502f255836f46
within the CORE-MATH project, we now have a tool to measure the efficiency of math library functions. Here is a comparison between the CORE-MATH routines and OpenLibm: https://sympa.inria.fr/sympa/arc/core-math/2022-03/msg00011.html. The CORE-MATH rountines are under MIT license, and should be easy to integrate in OpenLibm.
here are updated timings for the single precision functions, with revision d14da3d of CORE-MATH, on an Intel i5-4590:
zimmerma@oignon:~/svn/core-math$ LIBM=/tmp/libopenlibm.a CORE_MATH_PERF_MODE=rdtsc ./perf-all.sh
GNU libc version: 2.33
GNU libc release: release
acoshf 25.235 34.838 57.576
acosf 30.597 33.853 28.818
asinhf 31.645 52.324 77.610
asinf 26.226 30.628 30.436
atan2f 41.272 92.957 83.223
atanhf 29.906 76.711 74.118
atanf 30.843 39.067 31.997
cbrtf 25.882 63.389 28.038
coshf 33.966 25.702 41.201
cosf 37.762 28.093 32.371
erfcf 58.559 104.513 111.192
erff 20.508 75.847 70.405
exp10f 34.963 13.060 13.053
exp2f 16.133 10.229 16.260
expm1f 18.394 54.059 42.504
expf 17.705 10.954 28.197
hypotf 14.142 17.062 56.209
log10f 17.930 24.270 29.549
log1pf 21.399 32.188 31.572
log2f 15.032 12.018 27.589
logf 16.051 12.016 26.305
powf 45.266 29.032 231.151
sinhf 30.979 82.966 65.106
sinf 29.070 26.443 31.827
tanhf 17.895 74.847 62.337
tanf 29.203 65.431 32.322
First column is CORE-MATH's reciprocal throughput, 2nd is for GNU libc, and 3rd for OpenLibm.