`std.int128` unittest failure with `-O1 -mcpu=x86-64-v3` since #4892

Open the-horo opened this issue 8 months ago • 10 comments

1.41.0-beta1 is failing one of its unittests for me:

$ ctest -V -R std.int128
UpdateCTestConfiguration  from :/home/happy/tmp/ldc-build/DartConfiguration.tcl
UpdateCTestConfiguration  from :/home/happy/tmp/ldc-build/DartConfiguration.tcl
Test project /home/happy/tmp/ldc-build
Constructing a list of tests
Done constructing a list of tests
Updating test list for fixtures
Added 0 tests to meet fixture requirements
Checking test dependency graph...
Checking test dependency graph end
test 370
    Start 370: std.int128-shared

370: Test command: /home/happy/tmp/ldc-build/runtime/phobos2-test-runner-shared "std.int128"
370: Working Directory: /home/happy/tmp/ldc-build/runtime
370: Test timeout computed to be: 10000000
370: ****** FAIL release64 std.int128
370: core.exception.AssertError@std/int128.d(574): Int128(Cent(14, 0)) != Int128(Cent(15, 0))
370: ----------------
370: ??:? _d_assert_msg [0x7f703c761383]
370: ??:? pure nothrow @nogc @safe void std.int128.__unittest_L521_C1() [0x7f703d7507c7]
370: ??:? [0x7f703d751a8f]
370: ??:? [0x5632a7c2c697]
370: ??:? [0x5632a7c2c53f]
370: ??:? [0x5632a7c2c451]
370: ??:? runModuleUnitTests [0x7f703c784173]
370: ??:? void rt.dmain2._d_run_main2(char[][], ulong, extern (C) int function(char[][])*).runAll() [0x7f703c79f78a]
370: ??:? _d_run_main2 [0x7f703c79f5a6]
370: ??:? _d_run_main [0x7f703c79f38c]
370: ??:? [0x7f703c4bd16d]
370: ??:? __libc_start_main [0x7f703c4bd228]
370: ??:? [0x5632a7c2c1c4]
1/2 Test #370: std.int128-shared ................***Failed    0.01 sec
test 813
    Start 813: std.int128-debug-shared

813: Test command: /home/happy/tmp/ldc-build/runtime/phobos2-test-runner-debug-shared "std.int128"
813: Working Directory: /home/happy/tmp/ldc-build/runtime
813: Test timeout computed to be: 10000000
813: 0.000s PASS debug64 std.int128
2/2 Test #813: std.int128-debug-shared ..........   Passed    0.01 sec

The following tests passed:
	std.int128-debug-shared

50% tests passed, 1 tests failed out of 2

Total Test time (real) =   0.03 sec

The following tests FAILED:
	370 - std.int128-shared (Failed)
Errors while running CTest
Output from these tests are in: /home/happy/tmp/ldc-build/Testing/Temporary/LastTest.log
Use "--rerun-failed --output-on-failure" to re-run the failed cases verbosely.

I've configured the project with -DD_FLAGS_RELEASE="-O1;-mcpu=x86-64-v3"

the-horo avatar Apr 27 '25 19:04 the-horo

Reduced:

import std.int128;

void main() {
    auto c = Int128(5, 6);
    c *= Int128(10, 20);
    c /= Int128(10, 20);
    assert(c == Int128(0, 15));
}

Works fine with -O or -mcpu=x86-64-v3 alone, but fails with -O -mcpu=x86-64-v3, at least with LLVM 19 as used by the v1.41.0-beta1 package: https://d.godbolt.org/z/nP93WM6jK. The preceding multiplication is required to keep it failing.

If this isn't an LLVM bug, it might be caused by incomplete inline-asm clobbers in https://github.com/dlang/dmd/blob/5aee9a9eb57d2566eca8c05ebbd9a798bd3645ea/druntime/src/core/int128.d#L640-L646.
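For reference, a small sketch of why 15 is the expected quotient, assuming `Int128(hi, lo)` takes the high word first, as the reduced test uses it. The helper name `reducedQuotient` is made up for illustration. On an affected machine and flag combination, the last assertion is the one expected to trip.

```d
import std.int128;

// Hypothetical helper: replays the reduced test case and returns the quotient.
Int128 reducedQuotient()
{
    auto c = Int128(5, 6);   // hi = 5, lo = 6
    c *= Int128(10, 20);     // product wraps modulo 2^128
    c /= Int128(10, 20);
    return c;
}

void main()
{
    // Wrapped product: hi = 5*20 + 6*10 = 160, lo = 6*20 = 120.
    assert(Int128(5, 6) * Int128(10, 20) == Int128(160, 120));

    // 15 * (10, 20) = (150, 300) is below the dividend (160, 120),
    // 16 * (10, 20) = (160, 320) is above it, so the quotient truncates to 15.
    assert(Int128(0, 15) * Int128(10, 20) == Int128(150, 300));
    assert(Int128(0, 16) * Int128(10, 20) == Int128(160, 320));
    assert(Int128(150, 300) < Int128(160, 120));
    assert(Int128(160, 120) < Int128(160, 320));

    assert(reducedQuotient() == Int128(0, 15));
}
```

So the intermediate product is the same either way, and the observed 14 is off by one in the division step.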

kinke avatar Apr 27 '25 20:04 kinke

Seems to work with LLVM 20, at least the isolated test case.

Edit: Hmm, not sure, as I can't reproduce the failure on my Intel Raptor Lake CPU when using the prebuilt v1.41.0-beta1 binaries...

kinke avatar Apr 27 '25 21:04 kinke

I'm on an AMD Ryzen 7 5825U if it matters

the-horo avatar Apr 27 '25 21:04 the-horo

LLVM 15 is failing

the-horo avatar Apr 27 '25 21:04 the-horo

The prebuilt packages are failing for me too, so it is enough to apply the flags to the test program; the runtime doesn't need to be built with them.

the-horo avatar Apr 27 '25 21:04 the-horo

How's -mcpu=native for you? That made it work again on godbolt (once), but I'm not sure the godbolt runner CPUs are stable.

kinke avatar Apr 27 '25 21:04 kinke

-mcpu=native is what originally made it fail, but I tried to trim down the attributes:

$ ldc2 -mcpu=native -vv
Targeting 'x86_64-pc-linux-gnu' (CPU 'znver3' with features '+prfchw,-cldemote,+avx,+aes,+sahf,+pclmul,-xop,+crc32,+xsaves,-avx512fp16,-usermsr,-sm4,-egpr,+sse4.1,-avx512ifma,+xsave,+sse4.2,-tsxldtrk,-sm3,-ptwrite,-widekl,+invpcid,+64bit,+xsavec,-avx10.1-512,-avx512vpopcntdq,+cmov,-avx512vp2intersect,-avx512cd,+movbe,-avxvnniint8,-ccmp,-amx-int8,-kl,-avx10.1-256,-sha512,-avxvnni,-rtm,+adx,+avx2,-hreset,-movdiri,-serialize,+vpclmulqdq,-avx512vl,-uintr,-cf,+clflushopt,-raoint,-cmpccxadd,+bmi,-amx-tile,+sse,-gfni,-avxvnniint16,-amx-fp16,-ndd,+xsaveopt,+rdrnd,-avx512f,-amx-bf16,-avx512bf16,-avx512vnni,-push2pop2,+cx8,-avx512bw,+sse3,+pku,+fsgsbase,+clzero,+mwaitx,-lwp,+lzcnt,+sha,-movdir64b,-ppx,+wbnoinvd,-enqcmd,-avxneconvert,-tbm,-pconfig,-amx-complex,+ssse3,+cx16,+bmi2,+fma,+popcnt,-avxifma,+f16c,-avx512bitalg,+rdpru,+clwb,+mmx,+sse2,+rdseed,-avx512vbmi2,-prefetchi,+rdpid,-fma4,-avx512vbmi,+shstk,+vaes,-waitpkg,-sgx,+fxsr,-avx512dq,+sse4a')

the-horo avatar Apr 27 '25 22:04 the-horo

On godbolt, I get znver3 too for some attempts. And there -O1 -mcpu=native fails, but -O3 -mcpu=native passes again.

[FWIW, the isolated testcase works because the std.int128 binops are templates, and core.int128 is newly pragma(inline, true) for LDC, so the relevant code is all emitted during compilation with those flags.]
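A rough single-file sketch of the mechanism described in the note above. `Wide` and `addWithCarry` are made-up stand-ins, not the actual std.int128 or core.int128 code. The point is only that a templated operator is instantiated, and a `pragma(inline, true)` function is inlined, in whichever module uses them, so they are compiled with that module's -O/-mcpu flags rather than the flags the library was built with.

```d
// Stand-in for a low-level helper that the library marks pragma(inline, true).
pragma(inline, true)
ulong addWithCarry(ulong a, ulong b, ref bool carry)
{
    immutable sum = a + b;   // wraps modulo 2^64
    carry = sum < a;         // a wrapped sum is smaller than either operand
    return sum;
}

// Stand-in for a type whose binary operators are templates.
struct Wide
{
    ulong hi, lo;

    // Instantiated in the module that writes `a + b`.
    Wide opBinary(string op : "+")(Wide rhs) const
    {
        bool carry;
        immutable newLo = addWithCarry(lo, rhs.lo, carry);
        return Wide(hi + rhs.hi + carry, newLo);
    }
}

void main()
{
    // The carry out of the low word propagates into the high word.
    assert(Wide(0, ulong.max) + Wide(0, 1) == Wide(1, 0));
}
```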

On my laptop's Intel i7-13700H, I haven't managed to make it fail so far.

kinke avatar Apr 27 '25 22:04 kinke

LLVM 20 also fails (I've used #4911). -O3 is also failing, with both x86-64-v3 and native.

the-horo avatar Apr 27 '25 22:04 the-horo

On an AMD Ryzen 3960X (Zen v2), I can reproduce the failures with prebuilt v1.41.0-beta1 (FWIW, on Ubuntu 24, same as my laptop), with both x86-64-v3 and native.

kinke avatar Apr 29 '25 13:04 kinke