libm Potential to add #[inline] attributes where possible

Potential to add #[inline] attributes where possible

Open shssoichiro opened this issue 3 years ago • 2 comments

I noticed there is a PR #210 that removed most #[inline] annotations. However, in developing one of my crates, I found that Rust's stdlib cbrtf was a significant bottleneck. I attempted to use this crate, but it provided minimal benefit due to the functions not being inlined. The bottleneck was due to the fact that cbrtf was being called in a loop, and could not be autovectorized because it had to be accessed via a call instruction.

If I add the #[inline] attribute, my function is able to inline and autovectorize the code, which resulted in a 25% performance improvement. Therefore, I would move to consider readding the #[inline] attribute where possible. (#[inline(always)] is not needed, only #[inline] to allow the compiler to consider it.)

Edit: I tried doing the same with powf for other places it's used in the crate, and in that case it had a very negative effect... so this turns out to not be a universal thing at all :thinking:

Oct 24 '22 22:10 shssoichiro

If you've found a specific case where the lack of #[inline] causes a performance problem for you then we are happy to accept a PR to add the #[inline] back.

Oct 29 '22 19:10 Amanieu

I benchmarked every function with #[inline] vs. as-is, and there are a few considerable improvements. On my machine (x86_64 Arch Linux, Ryzen 7 5800H) I got these significant results - though I'd appreciate someone on a different system/architecture running a bench as well:

atan: 3 -> 2 ns/iter
atan2: 6 -> 5 ns/iter
cosf: 4 -> 2 ns/iter
coshf: 4 -> 3 ns/iter
expm1f: 3 -> 2 ns/iter
hypot: 4 -> 2 ns/iter
hypotf: 2 -> 1 ns/iter
sinf: 5 -> 2 ns/iter
tan: 4 -> 3 ns/iter
tanf: 7 -> 2 ns/iter
tanh: 8 -> 6 ns/iter

I only included results where #[inline] helped by >10% but I will do more runs and average out results before doing any proper PR, for now just want to put this out there

Dec 10 '22 21:12 nia-e

libm libm copied to clipboard

Potential to add #[inline] attributes where possible

libm
libm copied to clipboard