openlibm
openlibm copied to clipboard
Faster logl implementation available in FreeBSD msun
The musl libm looks really good and reasonably updated. It would be worthwhile to see if we can set up some scripts to extract to update openlibm. Ideally - we'd be able to do this automatically!
For example, there's a recent new log implementation: https://git.musl-libc.org/cgit/musl/tree/src/math/log.c
Looks like the rust guys have done it and maybe worth checking out as well: https://github.com/rust-lang/libm/tree/master/src/math
The downside is that it is hard to keep up with the upstream, although nothing much usually changes for a long time.
Do you mean just indiscriminately replacing all the openlibm implementations with musl implementations, or would somebody first check which implementation is better?
I am certainly not suggesting indiscriminately replacing it - updating the title since it is a bit misleading. I think we probably want a separate repo to test it out for performance and correctness, and then either handpick specific routines or replace it wholesale.
For example, there's a recent new
logimplementation: https://git.musl-libc.org/cgit/musl/tree/src/math/log.c
If you're going to update log, you should consider using the table-driven methods of PTP Tang (ACM Trans Math Soft, Vol. 16, No. 4, December 1990, Pages 378-400).
For Intel-80 bit and IEEE754 128-bit implementations see https://svnweb.freebsd.org/base/head/lib/msun/ld80/s_logl.c?view=log https://svnweb.freebsd.org/base/head/lib/msun/ld128/s_logl.c?view=log
Borrowing the msun log should be fairly straightforward for us since that is what we forked from a while ago. Are there any other msun improvements we should pull in?
@kargl Looks like that gives us a better logl, but log remains the same.
@kargl Looks like that gives us a better
logl, butlogremains the same.
Yeah, when logl was committed (and the same with expl), the old Sun fdlibm float and double versions were retained. That was partly my fault. I looked at the MUSL code you pointed to. It seems to be a table driven method. I cannot tell it if it is based on PTP Tang's paper or not.
Would be nice to benchmark the musl implementation. We know we have a slow log - https://github.com/JuliaLang/julia/issues/8869
cc @simonbyrne
@kargl Would you be able to provide fast log implementations here? That would be fantastic if possible.
I copied bits of the musl log into a PR in #204 for experimentation.
@kargl Would you be able to provide fast
logimplementations here? That would be fantastic if possible.
Just discovered that github sets noreply@ addresses in email headers, so my replies have gone off into the ether.
It is unlikely that I'll directly contribute to olibm as I don't use git and olibm source code has deviated from FreeBSD source code. I did some email spelunking and found a 2008 email from Bruce Evans (FreeBSd libm guru) with his table-driven implementation for logf(). Fortunately, Bruce used a 2-clause BSD license in that code; unfortnately, Bruce recently passed and I'm uncomfortable sending his code here.
I have started to rewriting my libm testing code, and have run some tests against FreeBSD libm and olibm. The reference log() is from MPFR with 256 bits of precision. The hardware is x86_64.
FreeBSD libm:
./tlibm -l -P -x 0.5 -X 10 -N 5 -s log Interval tested for logl: [0.5,10] 5000000 calls, 0.314830 secs, 0.06297 usecs/call ulp <= 0.5: 99.999% 4999926 | 99.999% 4999926 0.5 < ulp < 0.6: 0.001% 74 | 100.000% 5000000 Max ulp: 0.508292 at 1.003405100681020136215e+00
Olibm:
./tlibm -l -P -x 0.5 -X 10 -N 5 -s log Interval tested for logl: [0.5,10] 5000000 calls, 0.930705 secs, 0.18614 usecs/call ulp <= 0.5: 92.230% 4611524 | 92.230% 4611524 0.5 < ulp < 0.6: 5.233% 261630 | 97.463% 4873154 0.6 < ulp < 0.7: 1.782% 89085 | 99.245% 4962239 0.7 < ulp < 0.8: 0.590% 29483 | 99.834% 4991722 0.8 < ulp < 0.9: 0.144% 7208 | 99.979% 4998930 0.9 < ulp < 1.0: 0.020% 990 | 99.998% 4999920 1.0 < ulp < 1.5: 0.002% 80 | 100.000% 5000000 Max ulp: 1.081120 at 1.423922684784536956911e+00
So, logl() from FreeBSD is 3 times faster and quite a bit better for accuracy. olibm would benefit from grabbing the code. There are extensive comments.
a comparison of the accurary of different libraries (including Openlibm) for single precision and rounding to nearest is available at https://members.loria.fr/PZimmermann/papers/accuracy.pdf. Except for erf, Musl has a better accuracy than Openlibm.
Thanks @zimmermann6! And thank you for creating a github account to post here.
I think we may want to split some of these discussions into separate issues.
@zimmermann6 Is there a repo / test code somewhere, which we can use and incorporate into openlibm tests?
Moving the larger point around single precision to #212.
@zimmermann6 Is there a repo / test code somewhere, which we can use and incorporate into openlibm tests?
sorry there is no code available yet, but I can provide inputs for which large ulp errors are obtained.
Looked into what it might take to pull in the newer logl from msun, and it touches a lot of other code.
In case this information is useful, I just ran benchmarks of openlibm (compiled by gcc and clang) and system libm, and these were the results:
openlibm(gcc) syslibm openlibm(clang)
pow : 17.4427 MPS : 69.7010 MPS : 18.1047 MPS
hypot : 72.7657 MPS : 107.5491 MPS : 76.4914 MPS
exp : 132.5313 MPS : 197.3627 MPS : 145.4958 MPS
log : 129.2455 MPS : 216.8139 MPS : 130.2769 MPS
log10 : 103.1023 MPS : 118.6334 MPS : 96.9040 MPS
sin : 176.9966 MPS : 157.4333 MPS : 150.2365 MPS
cos : 162.1996 MPS : 166.1857 MPS : 161.8407 MPS
tan : 93.6082 MPS : 76.9096 MPS : 90.7496 MPS
asin : 124.9897 MPS : 114.8927 MPS : 128.0259 MPS
acos : 140.7629 MPS : 110.8343 MPS : 129.6873 MPS
atan : 120.1154 MPS : 79.6637 MPS : 130.1246 MPS
atan2 : 54.1519 MPS : 43.4398 MPS : 58.3252 MPS
System Libm is considerably faster than Openlibm on pow (4x) hypot, (1.5x) exp (1.5x) and log (1.7x), while Openlibm outperforms system Libm in almost all trigonometric functions (except cos).
Running on Kubuntu 21.04 (kernel 5.11.0-17-generic) in an Asus Rog Strix G531GT, CPU Intel Core i7-9750H CPU 2.60GHz.
Definitely useful but could you file as a separate issue?
Jessica, what do you mean by "system libm"? Is it glibc?
sorry there is no code available yet
now it is publicly available : https://gitlab.inria.fr/zimmerma/math_accuracy
Jessica, what do you mean by "system libm"? Is it glibc?
Yes, glibc.
Definitely useful but could you file as a separate issue?
Done. #234