Add rsqrt method to Float trait

Open cuviper opened this issue 7 years ago • 7 comments

From @davll on November 9, 2017 8:48

rsqrt is a widely used math function in game development, and it is faster than combining the two functions as recip(sqrt(x)), thanks to the x86 SSE instructions RSQRTSS and RSQRTPS. Should we consider adding rsqrt to the Float trait?

  • x86/x86_64 SSE: _mm_rsqrt_ps (note that the RSQRT instructions are approximate and less accurate than SQRT)
  • ARM Neon VRSQRTE
  • PowerPC Altivec: vec_rsqrte
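For illustration, here is a minimal sketch of the difference on x86_64, using the scalar SSE intrinsic `_mm_rsqrt_ss` from `std::arch`. The function names `rsqrt_approx` and `rsqrt_exact` are hypothetical and not part of num-traits or std; on other targets the sketch simply falls back to `sqrt().recip()`.

```rust
/// Approximate 1/sqrt(x) via the SSE RSQRTSS instruction.
/// Hypothetical helper for illustration only.
#[cfg(target_arch = "x86_64")]
pub fn rsqrt_approx(x: f32) -> f32 {
    use std::arch::x86_64::{_mm_cvtss_f32, _mm_rsqrt_ss, _mm_set_ss};
    // SAFETY: SSE is part of the x86_64 baseline, so these intrinsics
    // are available on every x86_64 CPU.
    unsafe { _mm_cvtss_f32(_mm_rsqrt_ss(_mm_set_ss(x))) }
}

/// Fallback for targets without the intrinsic: no speedup, full precision.
#[cfg(not(target_arch = "x86_64"))]
pub fn rsqrt_approx(x: f32) -> f32 {
    x.sqrt().recip()
}

/// Reference version built from the two existing operations.
pub fn rsqrt_exact(x: f32) -> f32 {
    x.sqrt().recip()
}
```

Comparing `rsqrt_approx(2.0)` with `rsqrt_exact(2.0)` should show a small relative error on x86_64, which is the precision trade-off mentioned above.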

Copied from original issue: rust-num/num#343

cuviper avatar Dec 19 '17 19:12 cuviper

This should probably be proposed for the standard library first. So far we don't have any arch-specific code -- I'd rather just have num's impl Float for f64 forward to an optimized/intrinsic call in std.
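As a rough sketch of what a portable version without arch-specific code could look like, the following defines a hypothetical extension trait `Rsqrt` (an assumption for illustration, not an existing num-traits or std API) that forwards to `sqrt().recip()`. If std ever gained an optimized method, the impl bodies could forward to it instead.

```rust
/// Hypothetical extension trait; not part of num-traits or std.
pub trait Rsqrt {
    /// Returns the reciprocal square root, 1/sqrt(self).
    fn rsqrt(self) -> Self;
}

impl Rsqrt for f32 {
    #[inline]
    fn rsqrt(self) -> Self {
        // Portable definition: two full-precision operations.
        self.sqrt().recip()
    }
}

impl Rsqrt for f64 {
    #[inline]
    fn rsqrt(self) -> Self {
        self.sqrt().recip()
    }
}
```

With the trait in scope, `4.0_f64.rsqrt()` evaluates to `0.5`, and negative inputs yield NaN just as `sqrt` does.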

cuviper avatar Dec 19 '17 19:12 cuviper

From @clarcharr on November 10, 2017 1:08

I agree; it makes sense to offer this in libstd.

cuviper avatar Dec 19 '17 19:12 cuviper

From @davll on November 10, 2017 3:10

Reasonable, I'll propose it in rust-lang/rust. I'll keep the issue open as a reminder.

cuviper avatar Dec 19 '17 19:12 cuviper

This appears to have been in libstd originally, then later removed. I don't think that anyone actually opened an issue for it, because I can't find it.

clarfonthey avatar Dec 21 '17 18:12 clarfonthey

rust-lang/rust#23549 added:

#[deprecated(since = "1.0.0", reason = "use self.sqrt().recip() instead")]

and then rust-lang/rust#24636 removed it. Neither PR mentioned rsqrt specifically.

If you propose it back to std, make sure to point out the possibility of intrinsics doing better than sqrt().recip(). However, you should also make sure that LLVM isn't already optimizing it to one op. It might not be able to do that because of the difference in precision, but it's worth checking.

cuviper avatar Dec 21 '17 20:12 cuviper

Looks like LLVM wouldn't optimize it automatically. It's also probably worth following the progress on fast-math support in rust-lang/rust#21690, since LLVM may generate rsqrt when the corresponding code uses the fdiv fast intrinsic; see https://github.com/llvm-mirror/llvm/blob/5e8f334dfad1ec4b8bcaae385d1c2598e18a03af/test/CodeGen/X86/sqrt-fastmath.ll#L245-L279

upsuper avatar Jun 16 '18 17:06 upsuper

This should probably be added to the Real trait at the same time.

andrewhickman avatar Jul 27 '18 19:07 andrewhickman