nearbyint: Fix for -ffast-math

Open RafaGago opened this issue 2 years ago • 3 comments

See commit text.

This fixes: #548

Note that I wasn't able to verify it on Visual Studio, only with g++ and Clang.

RafaGago, Aug 31 '21 19:08

Can you try the following patch instead? If it works, it's much less intrusive:

diff --git a/include/xsimd/arch/generic/xsimd_generic_math.hpp b/include/xsimd/arch/generic/xsimd_generic_math.hpp
index 56e4d98..2ef41de 100644
--- a/include/xsimd/arch/generic/xsimd_generic_math.hpp
+++ b/include/xsimd/arch/generic/xsimd_generic_math.hpp
@@ -1722,8 +1722,8 @@ namespace xsimd {
         // to v. That's not what we want, so prevent compiler optimization here.
         // FIXME: it may be better to emit a memory barrier here (?).
 #ifdef __FAST_MATH__
-        volatile batch_type d0 = v + t2n;
-        batch_type d = *(batch_type*)(void*)(&d0) - t2n;
+        volatile auto d0 = (v + t2n).data;
+        batch_type d = batch_type(d0) - t2n;
 #else
         batch_type d0 = v + t2n;
         batch_type d = d0 - t2n;
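
For readers unfamiliar with the trick being patched, here is a minimal scalar sketch of it. It is not xsimd's code: `nearbyint_sketch` is a hypothetical name, and the value 2^23 for `t2n` is an assumption that matches single-precision floats. Adding and then subtracting `t2n` makes the hardware drop the fractional bits, and the `volatile` store is the (possibly fragile) barrier meant to stop a fast-math compiler from simplifying `(v + t2n) - t2n` back to `v`.

```cpp
#include <cmath>

// Hypothetical scalar stand-in for the batch code in the diff above.
// Assumes IEEE-754 single precision and the default rounding mode
// (round to nearest, ties to even).
inline float nearbyint_sketch(float x)
{
    const float t2n = 8388608.0f;   // 2^23: above this, floats have no fractional bits
    const float v = std::fabs(x);   // work on the magnitude, restore the sign at the end

    // The volatile store forces v + t2n to be materialized as a float, so
    // the subtraction below cannot be folded away algebraically.
    volatile float d0 = v + t2n;
    const float d = d0 - t2n;       // reading d0 is a genuine load

    // Values at or above 2^23 are already integral and pass through unchanged.
    return std::copysign(v < t2n ? d : v, x);
}
```

Under the stated assumptions, `nearbyint_sketch(2.5f)` yields `2.0f` and `nearbyint_sketch(3.5f)` yields `4.0f`, because ties round to the even neighbour.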

serge-sans-paille, Aug 31 '21 20:08

I'm at $DAILY_JOB right now and don't have the codebase at hand, but I'd say that local-scope volatile variables are not enough to stop Clang. It seems to see through the intrinsic types; otherwise it wouldn't have been able to apply the optimization.

The problem with the volatile approach is that, even if it works now, we don't know whether a compiler update would silently break it.

Note that to remove the volatile qualifier, instead of writing "*(batch_type*)(void*)(&d0)" you can write "const_cast<batch_type const&>(d0)".
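
As a rough illustration of the two spellings, the sketch below uses a hypothetical `batch_stub` struct in place of `batch_type`; neither it nor the helper functions are part of xsimd. Both functions strip the `volatile` qualifier from a reference and read the first lane. The sketch binds a non-volatile object to a `volatile` reference, which is well defined. When the object itself is defined `volatile`, as `d0` is in the patch, the C++ standard makes access through a non-volatile glvalue undefined behaviour, so either spelling carries that caveat.

```cpp
// Hypothetical stand-in for a SIMD batch type; not xsimd's batch_type.
struct batch_stub
{
    float lanes[4];
};

// Pointer-cast spelling, mirroring the original code in the diff.
inline float first_lane_ptr_cast(volatile batch_stub& d0)
{
    return (*(batch_stub*)(void*)(&d0)).lanes[0];
}

// const_cast spelling: states the intent (drop volatile) explicitly
// and keeps the result read-only.
inline float first_lane_const_cast(volatile batch_stub& d0)
{
    return const_cast<batch_stub const&>(d0).lanes[0];
}
```

The `const_cast` form is easier to audit because it can only change cv-qualification, whereas the chain of C-style casts would also compile if the target type were accidentally wrong.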

RafaGago, Sep 01 '21 06:09

#551 achieves the same result on my laptop, while being less intrusive on the codebase. Can you confirm it works for you too?

serge-sans-paille, Oct 12 '21 19:10