
Incorrect Builds on macOS for Debug

Open miakramer opened this issue 4 years ago • 13 comments

On macOS (not sure about other OS, haven't tried), when building (with system clang) a Debug build, some of the AVX and AVX2 tests fail. The SSE tests pass, and my CPU doesn't have AVX512. RelWithDebInfo also seems to build correctly. The tests that fail are:

  • iutavx2
  • iutyavx2
  • iutavx
  • iutyavx
  • iutdsp256

The AVX/AVX2 tests that pass are:

  • iutavx2128
  • iutyavx2128

Please let me know if there's any more information I can provide.

Output from setting -DSLEEF_SHOW_CONFIG:

-- Could NOT find OpenSSL, try to set the path to OpenSSL root folder in the system variable OPENSSL_ROOT_DIR (missing: OPENSSL_INCLUDE_DIR)
-- Configuring build for SLEEF-v3.4.0
   Target system: Darwin-19.3.0
   Target processor: x86_64
   Host system: Darwin-19.3.0
   Host processor: x86_64
   Detected C compiler: AppleClang @ /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/cc
-- Using option `-Wall -Wno-unused -Wno-attributes -Wno-unused-result -ffp-contract=off -fno-math-errno -fno-trapping-math` to compile libsleef
-- Building shared libs : ON
-- MPFR : /usr/local/lib/libmpfr.dylib
-- MPFR header file in /usr/local/include
-- GMP : /usr/local/lib/libgmp.dylib
-- RT :
-- FFTW3 : /usr/local/lib/libfftw3.dylib
-- OPENSSL :
-- SDE : SDE_COMMAND-NOTFOUND
-- RUNNING_ON_TRAVIS : 0
-- COMPILER_SUPPORTS_OPENMP :
-- Configuring done
-- Generating done
-- Build files have been written to: /Users/Mia/git/sleef/build

miakramer avatar Apr 07 '20 03:04 miakramer

Oh, I forgot to mention. I noticed in my ray tracer on a debug build: atan2_8f seems correct for the first four lanes, but always returns 0 in the last four. acos_8f is similar, returning 1.5707964 (presumably pi/2) in the last four.

miakramer avatar Apr 07 '20 03:04 miakramer

The problem with the Debug build should now be fixed.

https://github.com/shibatch/sleef/tree/Fix_build_error_with_Debug_mode

As for atan2_8f, please post code that demonstrates the bug.

shibatch avatar Apr 07 '20 03:04 shibatch

Sorry, the fix doesn't seem to be working. I haven't finished all of the tests again yet, but so far at least iutavx2 and iutyavx2 have failed.

For the functions I mentioned, I meant that this seemed to be the broken behaviour. I'll see if I can make a small sample.

miakramer avatar Apr 07 '20 03:04 miakramer

Please tell me the model of your mac and the exact commands you gave.

shibatch avatar Apr 07 '20 04:04 shibatch

It should be easy to modify the sample source code at sleef.org.

https://sleef.org/hellox86.c

shibatch avatar Apr 07 '20 04:04 shibatch

It's a 2014 MacBook Pro, the CPU is an Intel i7-4870HQ. CPU features: FPU VME DE PSE TSC MSR PAE MCE CX8 APIC SEP MTRR PGE MCA CMOV PAT PSE36 CLFSH DS ACPI MMX FXSR SSE SSE2 SS HTT TM PBE SSE3 PCLMULQDQ DTES64 MON DSCPL VMX SMX EST TM2 SSSE3 FMA CX16 TPR PDCM SSE4.1 SSE4.2 x2APIC MOVBE POPCNT AES PCID XSAVE OSXSAVE SEGLIM64 TSCTMR AVX1.0 RDRAND F16C RDWRFSGS TSC_THREAD_OFFSET BMI1 AVX2 SMEP BMI2 ERMS INVPCID FPU_CSDS MDCLEAR IBRS STIBP L1DF SSBD

The commands I used:

git clone https://github.com/shibatch/sleef.git
cd sleef
git checkout origin/Fix_build_error_with_Debug_mode
mkdir build && cd build
cmake -DSLEEF_SHOW_CONFIG=TRUE -DCMAKE_BUILD_TYPE=Debug -DBUILD_DFT=FALSE ..
make -j 8
make test

miakramer avatar Apr 07 '20 04:04 miakramer

It does not reproduce in my environment.

shibatch avatar Apr 07 '20 04:04 shibatch

Please try again from the beginning and paste the output of "make test".

shibatch avatar Apr 07 '20 04:04 shibatch

Ok, I will try again.

Using the debug build that fails the test, I get this output:

Should be: 0.00;  0.45;  0.90;  1.35;  1.80;  2.24;  2.69;  3.14
Is:        0.00;  0.45;  0.90;  1.35;  1.57;  1.57;  1.57;  1.57

From this program:

#include <stdio.h>
#include <x86intrin.h>
#include <sleef.h>

int main(int argc, char **argv) {
  float in[] = {
    1.0,
    0.9009688679024191,
    0.6234898018587336,
    0.22252093395631445,
   -0.22252093395631434,
   -0.6234898018587335,
   -0.900968867902419,
   -1.0
  };
  float out[] = {
    0.0,
    0.4487989505128276,
    0.8975979010256552,
    1.3463968515384828,
    1.7951958020513104,
    2.243994752564138,
    2.6927937030769655,
    3.141592653589793
  };

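  __m256 vin, vout;

  // Load the eight inputs (unaligned load) into one 256-bit vector.
  vin  = _mm256_loadu_ps(in);

  // Evaluate acos on all eight lanes with the AVX2 variant.
  vout = Sleef_acosf8_u10avx2(vin);

  float res[8];

  _mm256_storeu_ps(res, vout);

  // Print the expected values next to what SLEEF returned.
  printf("Should be: %.2f;  %.2f;  %.2f;  %.2f;  %.2f;  %.2f;  %.2f;  %.2f\n", out[0], out[1], out[2], out[3], out[4], out[5], out[6], out[7]);
  printf("Is:        %.2f;  %.2f;  %.2f;  %.2f;  %.2f;  %.2f;  %.2f;  %.2f\n", res[0], res[1], res[2], res[3], res[4], res[5], res[6], res[7]);

  return 0;
}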

miakramer avatar Apr 07 '20 04:04 miakramer

I don't think this is a problem in SLEEF; it does not reproduce in my environment.

shibatch avatar Apr 07 '20 04:04 shibatch

I tried again from scratch, and it seems to be Apple clang-specific. Apple clang Debug fails the same tests as above, but regular LLVM clang Debug passed them all. Apple clang Release does work, though. I'm assuming SLEEF doesn't do anything special for Apple vs. regular clang, so I guess a compiler bug is possible?

miakramer avatar Apr 07 '20 05:04 miakramer

It is hard to identify the cause. I also tried Apple clang.

-- The C compiler identification is AppleClang 9.0.0.9000039
-- Check for working C compiler: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/cc
-- Check for working C compiler: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/cc - works

shibatch avatar Apr 07 '20 05:04 shibatch

I'll take a look in a few days when I have more time, but in the meantime I've posted the LLVM IR generated with -DSLEEF_ENABLE_LLVM_BITCODE in case it's helpful. I can also upload the .dylib file somewhere if that would help. These are .ll files; I just renamed them to .txt so GitHub would let me upload them.

miakramer avatar Apr 07 '20 17:04 miakramer