sleef
Incorrect Builds on macOS for Debug
On macOS (I haven't tried other operating systems), some of the AVX and AVX2 tests fail when I make a Debug build with the system clang. The SSE tests pass, and my CPU doesn't have AVX512. A RelWithDebInfo build also seems to work correctly. The tests that fail are:
- iutavx2
- iutyavx2
- iutavx
- iutyavx
- iutdsp256
The AVX/2 tests that pass are:
- iutavx2128
- iutyavx2128
Please let me know if there's any more information I can provide.
Output from setting -DSLEEF_SHOW_CONFIG:
-- Could NOT find OpenSSL, try to set the path to OpenSSL root folder in the system variable OPENSSL_ROOT_DIR (missing: OPENSSL_INCLUDE_DIR)
-- Configuring build for SLEEF-v3.4.0
Target system: Darwin-19.3.0
Target processor: x86_64
Host system: Darwin-19.3.0
Host processor: x86_64
Detected C compiler: AppleClang @ /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/cc
-- Using option `-Wall -Wno-unused -Wno-attributes -Wno-unused-result -ffp-contract=off -fno-math-errno -fno-trapping-math` to compile libsleef
-- Building shared libs : ON
-- MPFR : /usr/local/lib/libmpfr.dylib
-- MPFR header file in /usr/local/include
-- GMP : /usr/local/lib/libgmp.dylib
-- RT :
-- FFTW3 : /usr/local/lib/libfftw3.dylib
-- OPENSSL :
-- SDE : SDE_COMMAND-NOTFOUND
-- RUNNING_ON_TRAVIS : 0
-- COMPILER_SUPPORTS_OPENMP :
-- Configuring done
-- Generating done
-- Build files have been written to: /Users/Mia/git/sleef/build
Oh, I forgot to mention: in a Debug build of my ray tracer, atan2_8f seems correct for the first four lanes but always returns 0 in the last four. acos_8f is similar, returning 1.5707964 (presumably pi/2) in the last four.
The problem with the Debug build should now be fixed.
https://github.com/shibatch/sleef/tree/Fix_build_error_with_Debug_mode
As for atan2_8f, please post code that demonstrates the bug.
Sorry, the fix doesn't seem to be working. I haven't finished rerunning all of the tests yet, but so far at least iutavx2 and iutyavx2 have failed.
As for the functions I mentioned, I was only describing the broken behaviour I observed. I'll see if I can make a small sample.
Please tell me the model of your Mac and the exact commands you ran.
It should be easy to modify the sample source code at sleef.org.
https://sleef.org/hellox86.c
It's a 2014 MacBook Pro, the CPU is an Intel i7-4870HQ. CPU features: FPU VME DE PSE TSC MSR PAE MCE CX8 APIC SEP MTRR PGE MCA CMOV PAT PSE36 CLFSH DS ACPI MMX FXSR SSE SSE2 SS HTT TM PBE SSE3 PCLMULQDQ DTES64 MON DSCPL VMX SMX EST TM2 SSSE3 FMA CX16 TPR PDCM SSE4.1 SSE4.2 x2APIC MOVBE POPCNT AES PCID XSAVE OSXSAVE SEGLIM64 TSCTMR AVX1.0 RDRAND F16C RDWRFSGS TSC_THREAD_OFFSET BMI1 AVX2 SMEP BMI2 ERMS INVPCID FPU_CSDS MDCLEAR IBRS STIBP L1DF SSBD
The commands I used:
git clone https://github.com/shibatch/sleef.git
cd sleef
git checkout origin/Fix_build_error_with_Debug_mode
mkdir build && cd build
cmake -DSLEEF_SHOW_CONFIG=TRUE -DCMAKE_BUILD_TYPE=Debug -DBUILD_DFT=FALSE ..
make -j 8
make test
It does not reproduce in my environment.
Please try once more from the beginning, and paste the output of "make test".
Ok, I will try again.
Using the debug build that fails the test, I get this output:
Should be: 0.00; 0.45; 0.90; 1.35; 1.80; 2.24; 2.69; 3.14
Is: 0.00; 0.45; 0.90; 1.35; 1.57; 1.57; 1.57; 1.57
From this program:
#include <stdio.h>
#include <x86intrin.h>
#include <sleef.h>

int main(int argc, char **argv) {
    /* Inputs: cos(k*pi/7) for k = 0..7. */
    float in[] = {
        1.0,
        0.9009688679024191,
        0.6234898018587336,
        0.22252093395631445,
        -0.22252093395631434,
        -0.6234898018587335,
        -0.900968867902419,
        -1.0
    };
    /* Expected results: acos of each input, i.e. k*pi/7. */
    float out[] = {
        0.0,
        0.4487989505128276,
        0.8975979010256552,
        1.3463968515384828,
        1.7951958020513104,
        2.243994752564138,
        2.6927937030769655,
        3.141592653589793
    };
    __m256 vin, vout;
    /* Unaligned load and store, since the arrays are not 32-byte aligned. */
    vin = _mm256_loadu_ps(in);
    vout = Sleef_acosf8_u10avx2(vin);
    float res[8];
    _mm256_storeu_ps(res, vout);
    printf("Should be: %.2f; %.2f; %.2f; %.2f; %.2f; %.2f; %.2f; %.2f\n", out[0], out[1], out[2], out[3], out[4], out[5], out[6], out[7]);
    printf("Is: %.2f; %.2f; %.2f; %.2f; %.2f; %.2f; %.2f; %.2f\n", res[0], res[1], res[2], res[3], res[4], res[5], res[6], res[7]);
    return 0;
}
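Instead of comparing the two printed rows by eye, the check could be done per lane. The following is only a sketch: mismatch_mask and report_halves are hypothetical helper names, not part of SLEEF, and the tolerance is an arbitrary choice. They work on plain float arrays, so they can be given the out and res buffers from the program above.

```c
#include <stdio.h>

/* Hypothetical helper (not part of SLEEF): returns a bitmask in which
 * bit i is set when lane i of `actual` differs from `expected` by more
 * than `tol`. A NaN lane counts as a mismatch. Intended for n <= 32. */
unsigned mismatch_mask(const float *expected, const float *actual,
                       int n, float tol) {
    unsigned bad = 0;
    for (int i = 0; i < n; i++) {
        float d = actual[i] - expected[i];
        if (d < 0.0f) {
            d = -d;
        }
        /* Written as !(d <= tol) so that NaN is flagged too. */
        if (!(d <= tol)) {
            bad |= 1u << i;
        }
    }
    return bad;
}

/* Hypothetical helper: reports which 128-bit half of an 8-lane
 * (256-bit) result contains mismatching lanes. */
void report_halves(unsigned bad) {
    printf("low half  (lanes 0-3): %s\n", (bad & 0x0Fu) ? "MISMATCH" : "ok");
    printf("high half (lanes 4-7): %s\n", (bad & 0xF0u) ? "MISMATCH" : "ok");
}
```

In the program above this would be called as report_halves(mismatch_mask(out, res, 8, 1e-3f)) after the store. A mask with only the upper four bits set would match the symptom described earlier, where the first four lanes are correct and the last four are not.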
I think this is not a problem in SLEEF, and it does not reproduce in my environment.
I tried again from scratch, and it seems to be specific to Apple clang. An Apple clang Debug build fails the same tests as above, but a regular LLVM clang Debug build passes them all. An Apple clang Release build does work, though. I'm assuming SLEEF doesn't do anything special for Apple vs. regular clang, so I guess a compiler bug is possible?
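When comparing results across machines, it may help to have each test program print which compiler built it. Here is a small sketch using predefined macros; the function names are hypothetical, and it only describes the translation unit it is compiled in, not how libsleef itself was built.

```c
#include <stdio.h>

/* Returns 1 when compiled by Apple's clang, 0 otherwise.
 * Apple clang predefines __apple_build_version__ in addition to __clang__. */
int is_apple_clang(void) {
#if defined(__clang__) && defined(__apple_build_version__)
    return 1;
#else
    return 0;
#endif
}

/* Returns 1 when this file was compiled with optimization enabled
 * (GCC and clang predefine __OPTIMIZE__ at -O1 and above), else 0. */
int is_optimized_build(void) {
#ifdef __OPTIMIZE__
    return 1;
#else
    return 0;
#endif
}

/* Prints the compiler identity, optimization state and whether
 * AVX2 code generation was enabled for this file. */
void print_build_info(void) {
#if defined(__clang__)
    printf("compiler: clang %d.%d.%d%s\n", __clang_major__, __clang_minor__,
           __clang_patchlevel__, is_apple_clang() ? " (Apple)" : "");
#else
    printf("compiler: not clang\n");
#endif
    printf("optimized: %s\n", is_optimized_build() ? "yes" : "no");
#ifdef __AVX2__
    printf("AVX2 codegen: enabled\n");
#else
    printf("AVX2 codegen: disabled\n");
#endif
}
```

Calling print_build_info() at the start of a reproducer would make it clear whether a failing run came from Apple clang or from upstream LLVM clang, and whether that file was compiled with optimization.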
It is hard to identify the cause. I also tried Apple clang.
-- The C compiler identification is AppleClang 9.0.0.9000039
-- Check for working C compiler: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/cc
-- Check for working C compiler: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/cc - works
I'll take a look in a few days when I have more time, but in the meantime I've posted the LLVM IR generated with -DSLEEF_ENABLE_LLVM_BITCODE in case it's helpful. I can also upload the .dylib file somewhere if that would help. These are .ll files; I just renamed them to .txt so GitHub would let me upload them.