llvm-project
clang++: Optimization passes can strip away deadlock loops
Consider the following code:
#include <iostream>

int main()
{
    std::cout << "dead lock\n";
    for (;;) {}
}
Compiling the example above with LLVM's C++ front-end (clang++) at any optimization level other than -O0 causes the program to segfault. Judging by the assembly output, the optimization passes have stripped away the loop. GCC, by contrast, keeps the loop at every optimization level.
This is the generated assembly code by Clang (truncated for brevity):
main: # @main
pushq %rax
movq _ZSt4cout@GOTPCREL(%rip), %rdi
leaq .L.str(%rip), %rsi
movl $10, %edx
callq _ZSt16__ostream_insertIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_PKS3_l@PLT
pushq %rbx
leaq _ZStL8__ioinit(%rip), %rbx
movq %rbx, %rdi
callq _ZNSt8ios_base4InitC1Ev@PLT
movq _ZNSt8ios_base4InitD1Ev@GOTPCREL(%rip), %rdi
leaq __dso_handle(%rip), %rdx
movq %rbx, %rsi
popq %rbx
jmp __cxa_atexit@PLT # TAILCALL
And this is the assembly generated by GCC (truncated for brevity):
main:
.LFB1782:
subq $8, %rsp
movl $10, %edx
leaq .LC0(%rip), %rsi
leaq _ZSt4cout(%rip), %rdi
call _ZSt16__ostream_insertIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_PKS3_l@PLT
.L2:
jmp .L2
_GLOBAL__sub_I_main:
.LFB2310:
pushq %rbx
leaq _ZStL8__ioinit(%rip), %rbx
movq %rbx, %rdi
call _ZNSt8ios_base4InitC1Ev@PLT
movq _ZNSt8ios_base4InitD1Ev@GOTPCREL(%rip), %rdi
movq %rbx, %rsi
popq %rbx
leaq __dso_handle(%rip), %rdx
jmp __cxa_atexit@PLT
Notice that Clang optimized away the loop, whereas GCC kept it (jmp .L2). This looks to me like a bug in the pattern matching. However, the equivalent example written in C and compiled with LLVM's C front-end (clang) works as expected. Only the C++ front-end seems to strip the loop.
Godbolt: https://godbolt.org/z/csnjMjo7n
Infinite loops without side effects are undefined behavior in C++, so LLVM happily deletes them. You can disable this behavior by passing -fno-finite-loops to clang. ARM's documentation suggests putting an empty volatile inline asm statement in the loop: https://developer.arm.com/documentation/dui0773/d/Coding-Considerations/Infinite-Loops
There are various threads on this topic in the llvm community. For example https://github.com/llvm/llvm-project/issues/60622
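The inline-asm workaround mentioned above can be sketched roughly as follows. This is a minimal illustration, not code taken from ARM's documentation: the helper names `keep_alive`, `spin_forever`, and `spin_n` are hypothetical, and the `__asm__ __volatile__` syntax is a GNU extension that both Clang and GCC accept. The bounded `spin_n` variant exists only so the loop body can be exercised without hanging.

```cpp
#include <cstddef>

// Empty volatile inline asm (GNU extension, accepted by Clang and GCC).
// It emits no instructions, but the optimizer has to treat it as a side
// effect, so a loop containing it should not be removed as "finite".
inline void keep_alive() {
    __asm__ __volatile__("");
}

// Hypothetical helper: an intentional infinite loop that is expected to
// survive optimization because each iteration has an observable side effect.
[[noreturn]] inline void spin_forever() {
    for (;;) {
        keep_alive();
    }
}

// Bounded variant (for demonstration only): runs the same loop body
// n times and returns the number of iterations executed.
inline std::size_t spin_n(std::size_t n) {
    std::size_t iterations = 0;
    for (std::size_t i = 0; i < n; ++i) {
        keep_alive();
        ++iterations;
    }
    return iterations;
}
```

In the original example, replacing `for (;;) {}` with a call to `spin_forever()` should keep the loop at higher optimization levels. Alternatively, the source can stay unchanged and the flag mentioned above be used, e.g. `clang++ -O2 -fno-finite-loops`.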
Note, though, that the loop above is becoming well-defined in future versions of C++: https://isocpp.org/files/papers/P2809R3.html
GCC just happens to follow that paper already, while it looks like Clang does not (note that under older C++ standards both compilers' results are valid, because the behavior is undefined).
Confirmed this is now well-defined behavior.
Thanks for the helpful comments! For portability, I'll go with ARM's recommendation, at least until P2809R3 is adopted in LLVM.