Vc
macOS top-half avx / avx2 registers aren't handled correctly
Vc version / revision | Operating System | Compiler & Version | Compiler Flags | Assembler & Version | CPU |
---|---|---|---|---|---|
1.4 | macOS 10.15.3 | Apple clang version 11.0.0 (clang-1100.0.33.17) | avx / avx2 | Apple clang version 11.0.0 (clang-1100.0.33.17) | 2.6 GHz Dual-Core Intel Core i5 from MacBook Pro Retina, 13-inch, Mid 2014 |
In one of our projects, which uses Vc 1.4, a colleague reported a strange issue: the images we produce with Vc come out striped, with the first four pixels looking correct and the following four pixels black. It turns out that code which works fine on most platforms, and even on another macOS machine with a different CPU, produces wrong results on this machine. I believe Vc's own test suite shows the same issue; see below.
Is this a macOS/clang bug or a problem within Vc? Is there any way to work around it?
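To help narrow this down outside of Vc, here is a minimal sketch that checks each of the eight float lanes of a single 256-bit operation against known values. It is not taken from Vc; the names `mul_sub` and `mismatched_lanes` are hypothetical, and it only uses AVX when the file is built with AVX enabled (for example with `-mavx`), falling back to a scalar loop otherwise. I have not confirmed that it reproduces the problem on the affected machine.

```cpp
#include <array>
#include <cstddef>
#if defined(__AVX__)
#include <immintrin.h>
#endif

// Computes x * x - x for eight floats. Uses one 256-bit AVX register when the
// file is compiled with AVX enabled (e.g. -mavx), otherwise a scalar loop.
std::array<float, 8> mul_sub(const std::array<float, 8>& in) {
    std::array<float, 8> out{};
#if defined(__AVX__)
    const __m256 v = _mm256_loadu_ps(in.data());
    _mm256_storeu_ps(out.data(), _mm256_sub_ps(_mm256_mul_ps(v, v), v));
#else
    for (std::size_t i = 0; i < in.size(); ++i) out[i] = in[i] * in[i] - in[i];
#endif
    return out;
}

// Returns a bitmask with bit i set when lane i differs from the expected value.
// The inputs 1..8 keep x * x - x exactly representable, so neither rounding
// nor FMA contraction can cause a mismatch on its own.
unsigned mismatched_lanes() {
    const std::array<float, 8> in{1.0f, 2.0f, 3.0f, 4.0f, 5.0f, 6.0f, 7.0f, 8.0f};
    const std::array<float, 8> expected{0.0f, 2.0f, 6.0f, 12.0f,
                                        20.0f, 30.0f, 42.0f, 56.0f};
    const std::array<float, 8> got = mul_sub(in);
    unsigned mask = 0;
    for (std::size_t i = 0; i < got.size(); ++i) {
        if (got[i] != expected[i]) mask |= 1u << i;
    }
    return mask;
}
```

A result of `0` means all lanes match. A result of `0xF0` would mean lanes 4 to 7 (the upper 128 bits) are wrong while the lower half is fine, which would match the four-good, four-black pattern. Since the inputs are compile-time constants, an optimizing compiler may fold the whole computation, so for a real check the inputs should come from runtime data.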
Testcase
arithmetics_avx:
27: FAIL: ┍ at /Users/kdab/Vc/tests/arithmetics.cpp:231 (0x106d4beb4)):
27: FAIL: │ test * test - test ([16797700, 16797700, 16797700, 16797700, 16797700, 16797700, 16797700, 16797700]) ≈ Vec(j * j - j) ([16797702, 16797702, 16797702, 16797702, 16797702, 16797702, 16797702, 16797702]) -> m[0000 0000]
27: FAIL: │ distance: [-1, -1, -1, -1, -3.35544e+07, -3.35544e+07, -3.35544e+07, -3.35544e+07] ulp, allowed distance: ±1 ulp
27: FAIL: ┕ testMulSub<simd< float, AVX>>
arithmetics_avx2:
28: FAIL: ┍ at /Users/kdab/Vc/tests/arithmetics.cpp:231 (0x10ca05724)):
28: FAIL: │ test * test - test ([16797700, 16797700, 16797700, 16797700, 16797700, 16797700, 16797700, 16797700]) ≈ Vec(j * j - j) ([16797702, 16797702, 16797702, 16797702, 16797702, 16797702, 16797702, 16797702]) -> m[0000 0000]
28: FAIL: │ distance: [-1, -1, -1, -1, -3.35544e+07, -3.35544e+07, -3.35544e+07, -3.35544e+07] ulp, allowed distance: ±1 ulp
28: FAIL: ┕ testMulSub<simd< float, AVX>>
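As a reading aid for the output above: both printed values lie between 2^24 and 2^25, where adjacent floats are 2 apart, so 16797700 and 16797702 are neighbouring floats and a distance of -1 ulp is what one would expect for every lane. The sketch below computes a ulp distance from the raw bit patterns. It is an assumption-laden illustration, not the implementation used by Vc's test helpers; the names `ordered_bits` and `ulp_distance` are hypothetical, and NaN and infinity are not handled.

```cpp
#include <cstdint>
#include <cstring>

// Maps a float's bit pattern to a signed integer that increases monotonically
// with the float's value (sign-magnitude converted to a signed offset).
std::int64_t ordered_bits(float x) {
    std::uint32_t u = 0;
    std::memcpy(&u, &x, sizeof u);  // well-defined way to read the raw bits
    const std::int64_t magnitude = static_cast<std::int64_t>(u & 0x7fffffffu);
    return (u & 0x80000000u) ? -magnitude : magnitude;
}

// Signed distance from b to a, counted in representable float steps.
// Assumes both inputs are finite.
std::int64_t ulp_distance(float a, float b) {
    return ordered_bits(a) - ordered_bits(b);
}
```

With this definition, `ulp_distance(16797700.0f, 16797702.0f)` is -1, the same value the log reports for the first four lanes.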
Actual Results
The following tests FAILED:
27 - arithmetics_avx (Failed)
28 - arithmetics_avx2 (Failed)
59 - ulp_avx (Failed)
60 - ulp_avx2 (Failed)
71 - math_avx (Failed)
72 - math_avx2 (Failed)
103 - gatherinterleavedmemory_avx (Failed)
104 - gatherinterleavedmemory_avx2 (Failed)
Expected Results
no failures
Vc version / revision | Operating System | Compiler & Version | Compiler Flags | Assembler & Version | CPU |
---|---|---|---|---|---|
1.4 | macOS 10.15.3 | Apple clang version 11.0.0 (clang-1100.0.33.17) | avx / avx2 | Apple clang version 11.0.0 (clang-1100.0.33.17) | 4.01 GHz Quad-Core Intel Core i7 6700K |
Here is the complete test run output from my failing machine: test-mac.log
My MacBook Pro has no issues; all tests pass. Note that the software stack is the same:
Vc version / revision | Operating System | Compiler & Version | Compiler Flags | Assembler & Version | CPU |
---|---|---|---|---|---|
1.4 | macOS 10.15.3 | Apple clang version 11.0.0 (clang-1100.0.33.17) | avx / avx2 | Apple clang version 11.0.0 (clang-1100.0.33.17) | MacBook Pro (Retina, 15-inch, Mid 2015) 2.5 GHz Intel Core i7 |
Note that on my failing machine, all tests pass if I use GCC:
Vc version / revision | Operating System | Compiler & Version | Compiler Flags | Assembler & Version | CPU |
---|---|---|---|---|---|
1.4 | macOS 10.15.3 | g++-9 (Homebrew GCC 9.2.0_3) 9.2.0 | avx / avx2 | g++-9 (Homebrew GCC 9.2.0_3) 9.2.0 | 4.01 GHz Quad-Core Intel Core i7 6700K |