`std.int128` unittest failure with `-O1 -mcpu=x86-64-v3` since #4892
1.41.0-beta1 is failing one of its unittests for me:
$ ctest -V -R std.int128
UpdateCTestConfiguration from :/home/happy/tmp/ldc-build/DartConfiguration.tcl
UpdateCTestConfiguration from :/home/happy/tmp/ldc-build/DartConfiguration.tcl
Test project /home/happy/tmp/ldc-build
Constructing a list of tests
Done constructing a list of tests
Updating test list for fixtures
Added 0 tests to meet fixture requirements
Checking test dependency graph...
Checking test dependency graph end
test 370
Start 370: std.int128-shared
370: Test command: /home/happy/tmp/ldc-build/runtime/phobos2-test-runner-shared "std.int128"
370: Working Directory: /home/happy/tmp/ldc-build/runtime
370: Test timeout computed to be: 10000000
370: ****** FAIL release64 std.int128
370: core.exception.AssertError@std/int128.d(574): Int128(Cent(14, 0)) != Int128(Cent(15, 0))
370: ----------------
370: ??:? _d_assert_msg [0x7f703c761383]
370: ??:? pure nothrow @nogc @safe void std.int128.__unittest_L521_C1() [0x7f703d7507c7]
370: ??:? [0x7f703d751a8f]
370: ??:? [0x5632a7c2c697]
370: ??:? [0x5632a7c2c53f]
370: ??:? [0x5632a7c2c451]
370: ??:? runModuleUnitTests [0x7f703c784173]
370: ??:? void rt.dmain2._d_run_main2(char[][], ulong, extern (C) int function(char[][])*).runAll() [0x7f703c79f78a]
370: ??:? _d_run_main2 [0x7f703c79f5a6]
370: ??:? _d_run_main [0x7f703c79f38c]
370: ??:? [0x7f703c4bd16d]
370: ??:? __libc_start_main [0x7f703c4bd228]
370: ??:? [0x5632a7c2c1c4]
1/2 Test #370: std.int128-shared ................***Failed 0.01 sec
test 813
Start 813: std.int128-debug-shared
813: Test command: /home/happy/tmp/ldc-build/runtime/phobos2-test-runner-debug-shared "std.int128"
813: Working Directory: /home/happy/tmp/ldc-build/runtime
813: Test timeout computed to be: 10000000
813: 0.000s PASS debug64 std.int128
2/2 Test #813: std.int128-debug-shared .......... Passed 0.01 sec
The following tests passed:
std.int128-debug-shared
50% tests passed, 1 tests failed out of 2
Total Test time (real) = 0.03 sec
The following tests FAILED:
370 - std.int128-shared (Failed)
Errors while running CTest
Output from these tests are in: /home/happy/tmp/ldc-build/Testing/Temporary/LastTest.log
Use "--rerun-failed --output-on-failure" to re-run the failed cases verbosely.
I've configured the project with -DD_FLAGS_RELEASE="-O1;-mcpu=x86-64-v3".
Reduced:
import std.int128;
void main() {
    auto c = Int128(5, 6);
    c *= Int128(10, 20);
    c /= Int128(10, 20);
    assert(c == Int128(0, 15));
}
This works fine with either -O or -mcpu=x86-64-v3 alone, but fails with both combined (-O -mcpu=x86-64-v3), at least with LLVM 19 as used by the v1.41.0-beta1 package: https://d.godbolt.org/z/nP93WM6jK. The preceding multiplication is required to keep it failing.
If this isn't an LLVM bug, it might be caused by incomplete inline-asm clobbers in https://github.com/dlang/dmd/blob/5aee9a9eb57d2566eca8c05ebbd9a798bd3645ea/druntime/src/core/int128.d#L640-L646.
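For reference, the expected value in the reduced test can be cross-checked without going through `core.int128` at all. The following is only a sketch: it uses `std.bigint` and assumes that `Int128(hi, lo)` denotes `hi * 2^64 + lo` and that the multiplication wraps modulo 2^128. The helper names `make128` and `expectedQuotient` are hypothetical and not part of Phobos.

```d
import std.bigint : BigInt;

// Hypothetical helper: builds hi * 2^64 + lo as an arbitrary-precision value
// (assumed to match the meaning of Int128(hi, lo) for non-negative hi).
BigInt make128(long hi, ulong lo)
{
    return (BigInt(hi) << 64) + lo;
}

// Hypothetical helper: replays the reduced test case with BigInt arithmetic.
BigInt expectedQuotient()
{
    auto modulus = BigInt(1) << 128;   // 128-bit wraparound
    auto a = make128(5, 6);
    auto b = make128(10, 20);

    // (5*2^64 + 6) * (10*2^64 + 20) = 50*2^128 + 160*2^64 + 120,
    // which wraps to 160*2^64 + 120 (still positive as a signed 128-bit value).
    auto product = (a * b) % modulus;

    // 15 * b = 150*2^64 + 300 fits below the product,
    // while 16 * b = 160*2^64 + 320 exceeds it, so the quotient is 15.
    return product / b;
}

void main()
{
    assert(expectedQuotient() == 15);
}
```

Under these assumptions the quotient is 15, which matches the `Int128(Cent(15, 0))` side of the assertion message; the 14 on the other side is the miscomputed result.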
It seems to work with LLVM 20, at least for the isolated test case.
Edit: Hmm, not sure, as I can't reproduce the failure on my Intel Raptor Lake CPU when using the prebuilt v1.41.0-beta1 binaries...
I'm on an AMD Ryzen 7 5825U, if it matters.
LLVM 15 is failing too.
The prebuilt packages fail for me too, so it is enough to apply the flags to the test program; the runtime doesn't need to be built with them.
How's -mcpu=native for you? That made it work again on godbolt (once), but I'm not sure the godbolt runner CPUs are stable.
-mcpu=native is what originally made it fail, but I tried to trim down the attributes:
$ ldc2 -mcpu=native -vv
Targeting 'x86_64-pc-linux-gnu' (CPU 'znver3' with features '+prfchw,-cldemote,+avx,+aes,+sahf,+pclmul,-xop,+crc32,+xsaves,-avx512fp16,-usermsr,-sm4,-egpr,+sse4.1,-avx512ifma,+xsave,+sse4.2,-tsxldtrk,-sm3,-ptwrite,-widekl,+invpcid,+64bit,+xsavec,-avx10.1-512,-avx512vpopcntdq,+cmov,-avx512vp2intersect,-avx512cd,+movbe,-avxvnniint8,-ccmp,-amx-int8,-kl,-avx10.1-256,-sha512,-avxvnni,-rtm,+adx,+avx2,-hreset,-movdiri,-serialize,+vpclmulqdq,-avx512vl,-uintr,-cf,+clflushopt,-raoint,-cmpccxadd,+bmi,-amx-tile,+sse,-gfni,-avxvnniint16,-amx-fp16,-ndd,+xsaveopt,+rdrnd,-avx512f,-amx-bf16,-avx512bf16,-avx512vnni,-push2pop2,+cx8,-avx512bw,+sse3,+pku,+fsgsbase,+clzero,+mwaitx,-lwp,+lzcnt,+sha,-movdir64b,-ppx,+wbnoinvd,-enqcmd,-avxneconvert,-tbm,-pconfig,-amx-complex,+ssse3,+cx16,+bmi2,+fma,+popcnt,-avxifma,+f16c,-avx512bitalg,+rdpru,+clwb,+mmx,+sse2,+rdseed,-avx512vbmi2,-prefetchi,+rdpid,-fma4,-avx512vbmi,+shstk,+vaes,-waitpkg,-sgx,+fxsr,-avx512dq,+sse4a')
On godbolt, I get znver3 too for some attempts. And there -O1 -mcpu=native fails, but -O3 -mcpu=native passes again.
[FWIW, the isolated testcase works because the std.int128 binops are templates, and core.int128 is newly pragma(inline, true) for LDC, so the relevant code is all emitted during compilation with those flags.]
On my laptop's Intel i7-13700H, I haven't managed to make it fail so far.
LLVM 20 also fails (I've used #4911). -O3 fails as well, with both x86-64-v3 and native.
On an AMD Ryzen 3960X (Zen 2), I can reproduce the failures with the prebuilt v1.41.0-beta1 (FWIW, on Ubuntu 24, same as my laptop), with both x86-64-v3 and native.