sleef
[3.3/x86_64] 8 tests failed out of 31
http://debomatic-amd64.debian.net/distribution#experimental/sleef/3.3-1/buildlog
/usr/bin/ctest --force-new-ctest-process -j8
Test project /<<PKGBUILDDIR>>/obj-x86_64-linux-gnu
Start 1: gnuabi_compatibility_SSE2
Start 2: gnuabi_compatibility_AVX
Start 3: gnuabi_compatibility_AVX2
Start 4: gnuabi_compatibility_AVX512F
Start 5: gnuabi_compatibility_AVX512F_masked
Start 6: naivetestdp_1
Start 7: naivetestdp_2
Start 8: naivetestdp_3
1/31 Test #1: gnuabi_compatibility_SSE2 ............. Passed 0.01 sec
Start 9: naivetestdp_4
2/31 Test #2: gnuabi_compatibility_AVX .............. Passed 0.02 sec
Start 10: naivetestdp_5
3/31 Test #3: gnuabi_compatibility_AVX2 ............. Passed 0.02 sec
Start 11: naivetestdp_10
4/31 Test #4: gnuabi_compatibility_AVX512F .......... Passed 0.03 sec
Start 12: naivetestsp_1
5/31 Test #5: gnuabi_compatibility_AVX512F_masked ... Passed 0.05 sec
Start 13: naivetestsp_2
6/31 Test #6: naivetestdp_1 ......................... Passed 0.05 sec
Start 14: naivetestsp_3
7/31 Test #7: naivetestdp_2 ......................... Passed 0.05 sec
Start 15: naivetestsp_4
8/31 Test #12: naivetestsp_1 ......................... Passed 0.02 sec
Start 16: naivetestsp_5
9/31 Test #8: naivetestdp_3 ......................... Passed 0.09 sec
Start 17: naivetestsp_10
10/31 Test #13: naivetestsp_2 ......................... Passed 0.09 sec
Start 18: roundtriptest1ddp_12
11/31 Test #9: naivetestdp_4 ......................... Passed 0.15 sec
Start 19: roundtriptest1ddp_16
12/31 Test #10: naivetestdp_5 ......................... Passed 0.17 sec
Start 20: roundtriptest1dsp_12
13/31 Test #15: naivetestsp_4 ......................... Passed 0.17 sec
Start 21: roundtriptest1dsp_16
14/31 Test #14: naivetestsp_3 ......................... Passed 0.18 sec
Start 22: roundtriptest2ddp_2_2
15/31 Test #16: naivetestsp_5 ......................... Passed 0.23 sec
Start 23: roundtriptest2ddp_4_4
16/31 Test #22: roundtriptest2ddp_2_2 ................. Passed 0.81 sec
Start 24: roundtriptest2ddp_8_8
17/31 Test #17: naivetestsp_10 ........................ Passed 1.02 sec
Start 25: roundtriptest2ddp_10_10
18/31 Test #11: naivetestdp_10 ........................ Passed 1.23 sec
Start 26: roundtriptest2ddp_5_15
19/31 Test #25: roundtriptest2ddp_10_10 ...............***Failed 0.73 sec
Path(random) :1(ST) 4(ST) 2(ST) 2(ST) 1(ST)
ISA : AVX2 256 bit double
transpose NoMT(measured): 63591
transpose MT(measured): 173729
Path(random) :2(ST) 4(ST) 4(ST)
ISA : AVX2 256 bit double
transpose NoMT(loaded): 63591
transpose MT(loaded): 173729
complex : NG (0.855868)
Start 27: roundtriptest2dsp_2_2
20/31 Test #26: roundtriptest2ddp_5_15 ................***Failed 1.08 sec
Path(random) :1(ST) 4(ST) 2(ST) 2(ST) 3(ST) 1(ST) 1(ST) 1(ST)
ISA : AVX2 256 bit double
Path(random) :1(ST) 2(ST) 2(ST)
ISA : AVX2 256 bit double
transpose NoMT(measured): 76946
transpose MT(measured): 174351
Path(random) :3(ST) 1(ST) 3(ST) 3(ST) 1(ST) 2(ST) 2(ST)
ISA : AVX2 256 bit double
Path(random) :4(ST) 1(ST)
ISA : AVX2 256 bit double
transpose NoMT(loaded): 76946
transpose MT(loaded): 174351
complex : NG (0.861336)
Start 28: roundtriptest2dsp_4_4
21/31 Test #20: roundtriptest1dsp_12 .................. Passed 2.41 sec
Start 29: roundtriptest2dsp_8_8
22/31 Test #18: roundtriptest1ddp_12 .................. Passed 2.50 sec
Start 30: roundtriptest2dsp_10_10
23/31 Test #27: roundtriptest2dsp_2_2 ................. Passed 0.89 sec
Start 31: roundtriptest2dsp_5_15
24/31 Test #24: roundtriptest2ddp_8_8 .................***Failed 2.07 sec
Path(random) :3(ST) 2(ST) 3(ST)
ISA : AVX2 256 bit double
transpose NoMT(measured): 22646
transpose MT(measured): 1810050
Path(random) :1(ST) 2(ST) 3(ST) 2(ST)
ISA : AVX2 256 bit double
transpose NoMT(loaded): 22646
transpose MT(loaded): 1810050
complex : NG (0.85523)
25/31 Test #30: roundtriptest2dsp_10_10 ...............***Failed 0.54 sec
Path(random) :3(ST) 2(ST) 2(ST) 3(ST)
ISA : AVX2 256 bit float
transpose NoMT(measured): 36704
transpose MT(measured): 166311
Path(random) :2(ST) 2(ST) 3(ST) 3(ST)
ISA : AVX2 256 bit float
transpose NoMT(loaded): 36704
transpose MT(loaded): 166311
complex : NG (1.72439)
26/31 Test #31: roundtriptest2dsp_5_15 ................***Failed 0.61 sec
Path(random) :3(ST) 2(ST) 2(ST) 4(ST) 4(ST)
ISA : AVX2 256 bit float
Path(random) :2(ST) 3(ST)
ISA : AVX2 256 bit float
transpose NoMT(measured): 34700
transpose MT(measured): 119216
Path(random) :3(ST) 4(ST) 2(ST) 3(ST) 3(ST)
ISA : AVX2 256 bit float
Path(random) :2(ST) 3(ST)
ISA : AVX2 256 bit float
transpose NoMT(loaded): 34700
transpose MT(loaded): 119216
complex : NG (1.72819)
27/31 Test #29: roundtriptest2dsp_8_8 .................***Failed 2.05 sec
Path(random) :4(ST) 4(ST)
ISA : AVX2 256 bit float
transpose NoMT(measured): 13515
transpose MT(measured): 1884567
Path(random) :3(ST) 3(ST) 2(ST)
ISA : AVX2 256 bit float
transpose NoMT(loaded): 13515
transpose MT(loaded): 1884567
complex : NG (1.71725)
28/31 Test #21: roundtriptest1dsp_16 .................. Passed 4.78 sec
29/31 Test #19: roundtriptest1ddp_16 .................. Passed 4.99 sec
30/31 Test #28: roundtriptest2dsp_4_4 .................***Failed 15.35 sec
Path(random) :2(ST) 2(ST)
ISA : AVX2 256 bit float
transpose NoMT(measured): 5788
transpose MT(measured): 15256963
Path(random) :2(ST) 2(ST)
ISA : AVX2 256 bit float
transpose NoMT(loaded): 5788
transpose MT(loaded): 15256963
complex : NG (0.958348)
31/31 Test #23: roundtriptest2ddp_4_4 .................***Failed 17.41 sec
Path(random) :2(ST) 2(ST)
ISA : AVX2 256 bit double
transpose NoMT(measured): 15802
transpose MT(measured): 17304167
Path(random) :2(ST) 2(ST)
ISA : AVX2 256 bit double
transpose NoMT(loaded): 15802
transpose MT(loaded): 17304167
complex : NG (0.869041)
74% tests passed, 8 tests failed out of 31
Total Test time (real) = 17.71 sec
The following tests FAILED:
23 - roundtriptest2ddp_4_4 (Failed)
24 - roundtriptest2ddp_8_8 (Failed)
25 - roundtriptest2ddp_10_10 (Failed)
26 - roundtriptest2ddp_5_15 (Failed)
28 - roundtriptest2dsp_4_4 (Failed)
29 - roundtriptest2dsp_8_8 (Failed)
30 - roundtriptest2dsp_10_10 (Failed)
31 - roundtriptest2dsp_5_15 (Failed)
Errors while running CTest
build flags
dh_auto_configure -- \
-DCMAKE_BUILD_TYPE=RelWithDebInfo \
-DSLEEF_TEST_ALL_IUT=ON
cd obj-x86_64-linux-gnu && cmake -DCMAKE_INSTALL_PREFIX=/usr -DCMAKE_VERBOSE_MAKEFILE=ON -DCMAKE_BUILD_TYPE=None -DCMAKE_INSTALL_SYSCONFDIR=/etc -DCMAKE_INSTALL_LOCALSTATEDIR=/var -DCMAKE_EXPORT_NO_PACKAGE_REGISTRY=ON -DCMAKE_FIND_PACKAGE_NO_PACKAGE_REGISTRY=ON -DCMAKE_INSTALL_RUNSTATEDIR=/run "-GUnix Makefiles" -DCMAKE_BUILD_TYPE=RelWithDebInfo -DSLEEF_TEST_ALL_IUT=ON ..
machine configuration https://github.com/intel/mkl-dnn/issues/208#issuecomment-377965291
Hello @cdluminate,
I greatly appreciate that you work on debian packaging. Thank you for your report.
I haven't seen an error like this for some time. It is hard to debug since I don't have access to that computer. How about disabling the DFT library for now? It is not used by any project yet. If that is acceptable, please specify -DBUILD_DFT=FALSE as a CMake option.
@shibatch Thanks for the hint. Nothing breaks when I disable libsleefdft:
http://debomatic-amd64.debian.net/distribution#experimental/sleef/3.3-1/buildlog
I checked pytorch's code and the keyword SleefDFT does not appear anywhere, so I think it's fine to disable it.
I don't have shell access to that machine either.
@shibatch I get the same results on my laptop. If there is something I can do to help debug this, let me know.
@btashton Thank you! Can I see the full build log?
@shibatch here is the build log including running the tests:
https://gist.github.com/btashton/1f4ccfd27244100560d4ec010f5201a9
Could you also try compiling and testing with clang?
They all pass with clang.
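Since the failures appear with gcc but not with clang, a side-by-side build makes the comparison repeatable. The following is a minimal sketch and is not taken from the thread: the build-<compiler> directory names, the SRC_DIR variable, and the DRY_RUN switch are assumptions introduced for illustration.

```shell
#!/bin/sh
# Sketch: build and test the same source tree with two compilers.
# SRC_DIR, the build-<compiler> directory names, and DRY_RUN are assumptions.
SRC_DIR="${SRC_DIR:-$PWD}"   # must be an absolute path to the sleef checkout

# Run a command inside a directory (creating it if needed),
# or only print the command when DRY_RUN=1.
run_in() {
  dir="$1"; shift
  if [ "${DRY_RUN:-0}" = 1 ]; then
    printf '(cd %s && %s)\n' "$dir" "$*"
  else
    (mkdir -p "$dir" && cd "$dir" && "$@")
  fi
}

# Configure, build, and test with the given C compiler in its own directory.
build_and_test() {
  cc="$1"
  build_dir="build-$cc"
  run_in "$build_dir" cmake -DCMAKE_C_COMPILER="$cc" -DCMAKE_BUILD_TYPE=RelWithDebInfo "$SRC_DIR" &&
  run_in "$build_dir" cmake --build . &&
  run_in "$build_dir" ctest --output-on-failure
}

# Usage: build_and_test gcc; build_and_test clang
```

Setting DRY_RUN=1 prints the commands without running them, which is handy for checking the invocation before a long build.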
@btashton Please also let me know the system configuration: OS version, CPU model, etc.
Fedora 28, the details are listed here:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 4
On-line CPU(s) list: 0-3
Thread(s) per core: 2
Core(s) per socket: 2
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 61
Model name: Intel(R) Core(TM) i5-5200U CPU @ 2.20GHz
Stepping: 4
CPU MHz: 1095.807
CPU max MHz: 2700.0000
CPU min MHz: 500.0000
BogoMIPS: 4390.15
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 3072K
NUMA node0 CPU(s): 0-3
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb invpcid_single pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid rdseed adx smap intel_pt xsaveopt dtherm ida arat pln pts flush_l1d
[bashton@localhost build]$ uname -a
Linux localhost.localdomain 4.17.14-202.fc28.x86_64 #1 SMP Wed Aug 15 12:29:25 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
Do you have other versions of gcc installed on your computer? If so, please try testing with those versions.
And please try testing once again after executing the following command.
export OMP_WAIT_POLICY=passive
Unfortunately, the only other version of gcc that is easy for me to install right now is 3.4, since the distro includes it as a compat package, and this library will not build with it because of some gcc flags. I did try setting OMP_WAIT_POLICY, and it did not seem to have any effect.
Okay, I have been suspecting libgomp since the beginning. Only the 2D DFT is failing, and the difference between the 1D DFT and the 2D DFT is pretty simple, though the compiler generates complex code.
It is possible to check the failing part with gdb, but I think that will not provide very useful information.
It is still difficult to determine whether something is wrong with libgomp or with gcc itself.
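One low-cost way to probe the libgomp suspicion is to re-run only the 2D round-trip tests under different OpenMP settings. The sketch below assumes it is called from the CMake build directory. OMP_WAIT_POLICY=passive is the setting requested above; OMP_NUM_THREADS=1 is a standard OpenMP variable but was not suggested in this thread, so treat it as an assumption, as is the DRY_RUN switch.

```shell
#!/bin/sh
# Sketch: re-run only the 2D DFT round-trip tests under chosen OpenMP settings.
# Call it from the CMake build directory. DRY_RUN is an assumption for previewing.

# Run a command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = 1 ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Arguments are VAR=value pairs handed to env, e.g. OMP_WAIT_POLICY=passive.
run_2d_tests() {
  run env "$@" ctest -R roundtriptest2d --output-on-failure
}

# Usage:
#   run_2d_tests OMP_WAIT_POLICY=passive
#   run_2d_tests OMP_NUM_THREADS=1
```

If the tests still fail with a single thread, that would point away from a threading problem, though it would not prove the compiler is at fault.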
This is interesting. I'll compare clang and gcc results too.
@cdluminate Was gcc-8 used to build the failing tests on the server? I cannot see the log anymore.
@shibatch I set up a script to run against the official GCC docker images for 4.9, 5.5, 6.4, 7.3, and 8.2, and I could not reproduce this failure on the same hardware. Any other thoughts?
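The script itself is not shown in the thread. A loop of roughly the following shape is one way to do it; everything here is an assumption for illustration, including installing CMake with apt-get inside the container (the older images may need CMake from another source) and the /src and /build paths.

```shell
#!/bin/sh
# Sketch: build and test a sleef checkout inside several official gcc images.
# The tag list mirrors the versions named in the thread; the container steps
# (apt-get install cmake, /src and /build paths) are assumptions.
GCC_TAGS="4.9 5.5 6.4 7.3 8.2"

# Run a command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = 1 ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# $1 is the absolute path of the sleef checkout on the host.
test_in_gcc_images() {
  src_dir="$1"
  for tag in $GCC_TAGS; do
    run docker run --rm -v "$src_dir:/src:ro" "gcc:$tag" sh -c \
      'apt-get update -qq && apt-get install -y -qq cmake && mkdir /build && cd /build && cmake /src && make && ctest --output-on-failure' \
      || echo "build or tests failed in gcc:$tag"
  done
}

# Usage: test_in_gcc_images "$PWD"
```

Note that a container shares the host kernel and CPU but brings its own compiler and runtime libraries, so a clean result here narrows the problem to the distribution's toolchain rather than the hardware.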
@shibatch It should be gcc-8. Debian unstable has been shipping gcc-8 as the default compiler for some time.
@btashton No, I have no idea at all. In my CI environment, it is tested with gcc-4, gcc-7, gcc-8 in addition to clang, icc and MSVC. It seems that the problem only occurs with x86 and gcc-8.
And it is not always problematic with gcc-8. I don't have any problem with that combination.
I can also reproduce.
$ make test
Running tests...
Test project /home/chriselrod/Documents/libraries/sleef/build
Start 1: gnuabi_compatibility_SSE2
1/31 Test #1: gnuabi_compatibility_SSE2 ............. Passed 0.00 sec
Start 2: gnuabi_compatibility_AVX
2/31 Test #2: gnuabi_compatibility_AVX .............. Passed 0.00 sec
Start 3: gnuabi_compatibility_AVX2
3/31 Test #3: gnuabi_compatibility_AVX2 ............. Passed 0.00 sec
Start 4: gnuabi_compatibility_AVX512F
4/31 Test #4: gnuabi_compatibility_AVX512F .......... Passed 0.00 sec
Start 5: gnuabi_compatibility_AVX512F_masked
5/31 Test #5: gnuabi_compatibility_AVX512F_masked ... Passed 0.00 sec
Start 6: naivetestdp_1
6/31 Test #6: naivetestdp_1 ......................... Passed 0.00 sec
Start 7: naivetestdp_2
7/31 Test #7: naivetestdp_2 ......................... Passed 0.01 sec
Start 8: naivetestdp_3
8/31 Test #8: naivetestdp_3 ......................... Passed 0.01 sec
Start 9: naivetestdp_4
9/31 Test #9: naivetestdp_4 ......................... Passed 0.00 sec
Start 10: naivetestdp_5
10/31 Test #10: naivetestdp_5 ......................... Passed 0.01 sec
Start 11: naivetestdp_10
11/31 Test #11: naivetestdp_10 ........................ Passed 0.28 sec
Start 12: naivetestsp_1
12/31 Test #12: naivetestsp_1 ......................... Passed 0.00 sec
Start 13: naivetestsp_2
13/31 Test #13: naivetestsp_2 ......................... Passed 0.01 sec
Start 14: naivetestsp_3
14/31 Test #14: naivetestsp_3 ......................... Passed 0.00 sec
Start 15: naivetestsp_4
15/31 Test #15: naivetestsp_4 ......................... Passed 0.01 sec
Start 16: naivetestsp_5
16/31 Test #16: naivetestsp_5 ......................... Passed 0.00 sec
Start 17: naivetestsp_10
17/31 Test #17: naivetestsp_10 ........................ Passed 0.27 sec
Start 18: roundtriptest1ddp_12
18/31 Test #18: roundtriptest1ddp_12 .................. Passed 0.12 sec
Start 19: roundtriptest1ddp_16
19/31 Test #19: roundtriptest1ddp_16 .................. Passed 1.35 sec
Start 20: roundtriptest1dsp_12
20/31 Test #20: roundtriptest1dsp_12 .................. Passed 0.10 sec
Start 21: roundtriptest1dsp_16
21/31 Test #21: roundtriptest1dsp_16 .................. Passed 1.12 sec
Start 22: roundtriptest2ddp_2_2
22/31 Test #22: roundtriptest2ddp_2_2 ................. Passed 0.04 sec
Start 23: roundtriptest2ddp_4_4
23/31 Test #23: roundtriptest2ddp_4_4 .................***Failed 0.13 sec
Start 24: roundtriptest2ddp_8_8
24/31 Test #24: roundtriptest2ddp_8_8 .................***Failed 0.03 sec
Start 25: roundtriptest2ddp_10_10
25/31 Test #25: roundtriptest2ddp_10_10 ...............***Failed 0.10 sec
Start 26: roundtriptest2ddp_5_15
26/31 Test #26: roundtriptest2ddp_5_15 ................***Failed 0.17 sec
Start 27: roundtriptest2dsp_2_2
27/31 Test #27: roundtriptest2dsp_2_2 ................. Passed 0.05 sec
Start 28: roundtriptest2dsp_4_4
28/31 Test #28: roundtriptest2dsp_4_4 .................***Failed 0.13 sec
Start 29: roundtriptest2dsp_8_8
29/31 Test #29: roundtriptest2dsp_8_8 .................***Failed 0.02 sec
Start 30: roundtriptest2dsp_10_10
30/31 Test #30: roundtriptest2dsp_10_10 ...............***Failed 0.08 sec
Start 31: roundtriptest2dsp_5_15
31/31 Test #31: roundtriptest2dsp_5_15 ................***Failed 0.13 sec
74% tests passed, 8 tests failed out of 31
Total Test time (real) = 4.21 sec
The following tests FAILED:
23 - roundtriptest2ddp_4_4 (Failed)
24 - roundtriptest2ddp_8_8 (Failed)
25 - roundtriptest2ddp_10_10 (Failed)
26 - roundtriptest2ddp_5_15 (Failed)
28 - roundtriptest2dsp_4_4 (Failed)
29 - roundtriptest2dsp_8_8 (Failed)
30 - roundtriptest2dsp_10_10 (Failed)
31 - roundtriptest2dsp_5_15 (Failed)
Errors while running CTest
make: *** [Makefile:95: test] Error 8
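This summary lists the same eight 2D tests as the Debian log at the top of the thread. When comparing many runs, the failed test names can be pulled out of a saved CTest log mechanically. The helper below is a small sketch; the function name failed_tests is hypothetical, and the pattern assumes the "<number> - <name> (Failed)" summary format seen in the logs here.

```shell
#!/bin/sh
# Sketch: print the names of failed tests from a CTest log.
# Reads the files given as arguments, or standard input if none are given.
# Summary lines look like: "23 - roundtriptest2ddp_4_4 (Failed)"
failed_tests() {
  sed -n 's/^[[:space:]]*[0-9][0-9]* - \(.*\) (Failed)$/\1/p' "$@"
}

# Usage, assuming two saved logs:
#   failed_tests debian.log | sort > a.txt
#   failed_tests fedora.log | sort > b.txt
#   diff a.txt b.txt
```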
I am on Fedora 29beta, with gcc 8.2.1, glibc 2.28, and an x86 processor (with avx-512). cmake call:
cmake -DCMAKE_BUILD_TYPE=Release -DBUILT_DFT=FALSE ..
When I tried:
cmake -DCMAKE_BUILD_TYPE=Debug -DBUILT_DFT=FALSE ..
make fails with:
/usr/bin/ld: ../../lib/libsleefgnuabi.so.3.3: undefined reference to `Sleef_x86CpuID'
collect2: error: ld returned 1 exit status
make[2]: *** [src/libm-tester/CMakeFiles/gnuabi_compatibility_AVX.dir/build.make:86: bin/gnuabi_compatibility_AVX] Error 1
make[1]: *** [CMakeFiles/Makefile2:3018: src/libm-tester/CMakeFiles/gnuabi_compatibility_AVX.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....
/usr/bin/ld: ../../lib/libsleefgnuabi.so.3.3: undefined reference to `Sleef_x86CpuID'
collect2: error: ld returned 1 exit status
/usr/bin/ld: ../../lib/libsleefgnuabi.so.3.3: undefined reference to `Sleef_x86CpuID'
collect2: error: ld returned 1 exit status
/usr/bin/ld: ../../lib/libsleefgnuabi.so.3.3: undefined reference to `Sleef_x86CpuID'
collect2: error: ld returned 1 exit status
make[2]: *** [src/libm-tester/CMakeFiles/gnuabi_compatibility_AVX2.dir/build.make:86: bin/gnuabi_compatibility_AVX2] Error 1
make[2]: *** [src/libm-tester/CMakeFiles/gnuabi_compatibility_SSE2.dir/build.make:86: bin/gnuabi_compatibility_SSE2] Error 1
make[1]: *** [CMakeFiles/Makefile2:3245: src/libm-tester/CMakeFiles/gnuabi_compatibility_AVX2.dir/all] Error 2
make[1]: *** [CMakeFiles/Makefile2:2374: src/libm-tester/CMakeFiles/gnuabi_compatibility_SSE2.dir/all] Error 2
make[2]: *** [src/libm-tester/CMakeFiles/gnuabi_compatibility_AVX512F.dir/build.make:86: bin/gnuabi_compatibility_AVX512F] Error 1
make[1]: *** [CMakeFiles/Makefile2:2449: src/libm-tester/CMakeFiles/gnuabi_compatibility_AVX512F.dir/all] Error 2
/usr/bin/ld: ../../lib/libsleefgnuabi.so.3.3: undefined reference to `Sleef_x86CpuID'
collect2: error: ld returned 1 exit status
make[2]: *** [src/libm-tester/CMakeFiles/gnuabi_compatibility_AVX512F_masked.dir/build.make:86: bin/gnuabi_compatibility_AVX512F_masked] Error 1
make[1]: *** [CMakeFiles/Makefile2:2337: src/libm-tester/CMakeFiles/gnuabi_compatibility_AVX512F_masked.dir/all] Error 2
[ 64%] Built target sleefsse2
[ 64%] Built target sleefavx512fnofma
make: *** [Makefile:141: all] Error 2
EDIT: Things would've gone better had I been able to spell "build" correctly the first time. =P
@chriselrod Thank you for your report. It is a known problem that the build fails when you specify -DCMAKE_BUILD_TYPE=Debug to build GNUABI libs. Please try something like
cmake -DCMAKE_BUILD_TYPE=Debug -DBUILD_DFT=TRUE -DBUILD_GNUABI_LIBS=FALSE ..
So, it seems to have something to do with gcc-8, and it is likely to be reproducible on Fedora. I will install Fedora on my computer and test it.
I can reproduce something similar (perhaps the same problem?) if I build with the CFLAGS typically used for RPM packaging on Fedora 32. You can check these with rpm -E '%set_build_flags':
CFLAGS="${CFLAGS:--O2 -g -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -fexceptions -fstack-protector-strong -grecord-gcc-switches -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection}" ; export CFLAGS ;
CXXFLAGS="${CXXFLAGS:--O2 -g -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -fexceptions -fstack-protector-strong -grecord-gcc-switches -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection}" ; export CXXFLAGS ;
FFLAGS="${FFLAGS:--O2 -g -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -fexceptions -fstack-protector-strong -grecord-gcc-switches -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -I/usr/lib64/gfortran/modules}" ; export FFLAGS ;
FCFLAGS="${FCFLAGS:--O2 -g -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -fexceptions -fstack-protector-strong -grecord-gcc-switches -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -m64 -mtune=generic -fasynchronous-unwind-tables -fstack-clash-protection -fcf-protection -I/usr/lib64/gfortran/modules}" ; export FCFLAGS ;
LDFLAGS="${LDFLAGS:--Wl,-z,relro -Wl,--as-needed -Wl,-z,now -specs=/usr/lib/rpm/redhat/redhat-hardened-ld}" ; export LDFLAGS ;
LT_SYS_LIBRARY_PATH="${LT_SYS_LIBRARY_PATH:-/usr/lib64:}" ; export LT_SYS_LIBRARY_PATH
I am invoking cmake the way the %cmake, %cmake_build, and %ctest RPM macros would, except that I am requesting the Ninja backend since the documentation says that is needed for a parallel build:
# first set CFLAGS, LDFLAGS, etc. as above, then:
/usr/bin/cmake \
\
\
-DCMAKE_C_FLAGS_RELEASE:STRING="-DNDEBUG" \
-DCMAKE_CXX_FLAGS_RELEASE:STRING="-DNDEBUG" \
-DCMAKE_Fortran_FLAGS_RELEASE:STRING="-DNDEBUG" \
-DCMAKE_VERBOSE_MAKEFILE:BOOL=ON \
-DCMAKE_INSTALL_PREFIX:PATH=/usr \
-DINCLUDE_INSTALL_DIR:PATH=/usr/include \
-DLIB_INSTALL_DIR:PATH=/usr/lib64 \
-DSYSCONF_INSTALL_DIR:PATH=/etc \
-DSHARE_INSTALL_PREFIX:PATH=/usr/share \
-DLIB_SUFFIX=64 \
-DBUILD_SHARED_LIBS:BOOL=ON \
-GNinja \
../sleef
/usr/bin/cmake --build "." -j4 --verbose
/usr/bin/ctest --output-on-failure --force-new-ctest-process -j4 --verbose
Here is what I see:
The following tests FAILED:
55 - fftwtest2ddp_4_4 (Failed)
56 - fftwtest2ddp_8_8 (Failed)
57 - fftwtest2ddp_10_10 (Failed)
58 - fftwtest2ddp_5_15 (Failed)
60 - fftwtest2dsp_4_4 (Failed)
61 - fftwtest2dsp_8_8 (Failed)
62 - fftwtest2dsp_10_10 (Failed)
63 - fftwtest2dsp_5_15 (Failed)
Changing --O2 to --O1 or --Og in CFLAGS causes the tests to pass. Strangely, so does changing it to --O3. The -- instead of - is a bit of macro fussiness (the first - belongs to the ${CFLAGS:-...} default-value syntax); the actual compiler flag is -O2, -O1, etc.
The machine I am testing on is an ancient x86_64 box that only supports SSE2. Fedora 32 currently has cmake 3.17.4, gcc-10.2.1, and I used sleef cc4b0213f2f57a2f7e8f6355758dc40973ae9998.
I don’t know if this sheds any light on anything or not. Particularly, I do not know if @cdluminate was explicitly setting build flags in this manner or not. I am happy to run any tests on other versions of Fedora, CentOS, etc. I can also try it with a VM on a machine that supports AVX2 if it matters.
It turns out the details about the flags added for RPM packaging on Fedora are not relevant. The following fails in the same way on Fedora 32, in an empty build directory and with no special environment variables set:
/usr/bin/cmake -DBUILD_SHARED_LIBS:BOOL=ON -GNinja ../sleef &&
/usr/bin/cmake --build "." -j4 --verbose &&
/usr/bin/ctest --output-on-failure --force-new-ctest-process -j4 --verbose
How about just turning off DFT? No one is using DFT, so that should be okay.
Out of curiosity, have you benchmarked against fftw?
Yes. See the benchmark. In some cases, it's better than FFTW. It still has problems in the planner, though: I have to specify the plan manually to maximize performance.
I just tried this again with sleef 3.6, using the latest patched GCC 14.0.1 in Fedora Rawhide. It looks like the DFT tests are passing now (on x86_64, aarch64, ppc64le, and s390x).