test_malloc abort on sparc
Source: master (69e188067)
Host: Linux 5.18.0-3-sparc64-smp (Debian)
Compiler: clang 13.0.1
How to build: CC=clang ./configure --enable-assertions --disable-atomic-intrinsics && make -j check
Fail rate: ~3/4
Note: cannot reproduce with gcc-12

Output 1 (test_malloc):
Performing 1000 reversals of 1000 element lists in 16 threads
Testing AO_malloc/AO_free
Aborted

Output 2 (test_malloc):
Performing 1000 reversals of 1000 element lists in 16 threads
Testing AO_malloc/AO_free
Segmentation fault
Possibly related issue: #44 (a SIGSEGV in test_malloc on another architecture)
As of source: master (620ae9dd2)
How to build: CC=clang ./configure --enable-assertions && make -j check CFLAGS_EXTRA="-D AO_DISABLE_GCC_ATOMICS"
Observed at least on the gcc102 machine (GCC Compile Farm).

Not reproduced with any of:
CFLAGS_EXTRA="-D AO_USE_ALMOST_LOCK_FREE"
CFLAGS_EXTRA="-D AO_DISABLE_GCC_ATOMICS -D AO_NO_SPARC_V9"
CFLAGS_EXTRA="-D AO_DISABLE_GCC_ATOMICS -D AO_GENERALIZE_ASM_BOOL_CAS"
Changing the code in AO_stack_pop_explicit_aux_acquire works around the issue:
if (AO_EXPECT_FALSE(!AO_compare_and_swap_release(list, first, next)))
->
if (AO_EXPECT_FALSE(first != AO_fetch_compare_and_swap_release(list, first, next)))
Asm code (original):
.LBB5_14:
and %i4, -8, %g3
ldx [%g3], %g4
mov %i4, %g5
!APP
membar #StoreLoad | #LoadLoad
casx [%i0],%g5,%g4
membar #StoreLoad | #StoreStore
cmp %g5,%g4
be,a 0f
mov 1,%g5
clr %g5
0:
!NO_APP
cmp %g5, 0
bne .LBB5_16
nop
!APP
Asm code after the test workaround:
.LBB5_14:
and %i4, -8, %g3
ldx [%g3], %g4
!APP
membar #StoreLoad | #LoadLoad
casx [%i0],%i4,%g4
membar #StoreLoad | #StoreStore
!NO_APP
cmp %i4, %g4
be %xcc, .LBB5_16
nop
ba .LBB5_15
nop
.LBB5_15:
(The difference is that the 2nd variant does not use %g5.)
Hello @kernigh and @hboehm,
If you have any insight into the root cause of this failure, please let me know. For now I'm going to apply a workaround by simplifying the asm code in AO_compare_and_swap_full (moving the comparison of the old value and the CAS result to the C level).
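
For illustration only, here is a minimal sketch of what "comparison at the C level" could look like. It is not the actual libatomic_ops change: fetch_cas and bool_cas are hypothetical names, and C11 <stdatomic.h> stands in for the SPARC casx inline asm. The fetch-style primitive only returns the value it found in memory, and the boolean result is derived by an ordinary C comparison, so no scratch register has to be set inside the asm block.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a fetch-style CAS primitive:
 * returns the value that was found at *addr. */
static size_t fetch_cas(_Atomic size_t *addr, size_t old_val, size_t new_val)
{
    size_t expected = old_val;

    /* On failure, C11 writes the observed value into 'expected';
     * on success, 'expected' still equals old_val. */
    atomic_compare_exchange_strong(addr, &expected, new_val);
    return expected;
}

/* Boolean CAS built on top of the fetch-style one:
 * the old-value comparison is plain C, outside any asm. */
static int bool_cas(_Atomic size_t *addr, size_t old_val, size_t new_val)
{
    return fetch_cas(addr, old_val, new_val) == old_val;
}
```

With this shape, a caller testing !bool_cas(list, first, next) and a caller testing first != fetch_cas(list, first, next) take the same branch, which is the equivalence the workaround relies on.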