llvm-project clang++: Optimization passes can strip away deadlock loops

Consider the following code:

#include <iostream>

int main()
{
  std::cout << "dead lock\n";
  for (;;) {}
}

Compiling the above example with LLVM's C++ front-end (clang++) at any optimization level other than -O0 causes the program to segfault. Looking at the assembly output, it seems the optimization passes have stripped away the loop. GCC, compiling the same code, correctly keeps the loop at every optimization level.

This is the generated assembly code by Clang (truncated for brevity):

main:                                   # @main
	pushq	%rax
	movq	_ZSt4cout@GOTPCREL(%rip), %rdi
	leaq	.L.str(%rip), %rsi
	movl	$10, %edx
	callq	_ZSt16__ostream_insertIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_PKS3_l@PLT
	pushq	%rbx
	leaq	_ZStL8__ioinit(%rip), %rbx
	movq	%rbx, %rdi
	callq	_ZNSt8ios_base4InitC1Ev@PLT
	movq	_ZNSt8ios_base4InitD1Ev@GOTPCREL(%rip), %rdi
	leaq	__dso_handle(%rip), %rdx
	movq	%rbx, %rsi
	popq	%rbx
	jmp	__cxa_atexit@PLT                # TAILCALL

And this one by GCC (truncated for brevity):

main:
.LFB1782:
	subq	$8, %rsp
	movl	$10, %edx
	leaq	.LC0(%rip), %rsi
	leaq	_ZSt4cout(%rip), %rdi
	call	_ZSt16__ostream_insertIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_PKS3_l@PLT
.L2:
	jmp	.L2

_GLOBAL__sub_I_main:
.LFB2310:
	pushq	%rbx
	leaq	_ZStL8__ioinit(%rip), %rbx
	movq	%rbx, %rdi
	call	_ZNSt8ios_base4InitC1Ev@PLT
	movq	_ZNSt8ios_base4InitD1Ev@GOTPCREL(%rip), %rdi
	movq	%rbx, %rsi
	popq	%rbx
	leaq	__dso_handle(%rip), %rdx
	jmp	__cxa_atexit@PLT

Notice that Clang optimized away the loop, whereas GCC kept it (jmp .L2). This looks to me like a bug in the pattern matching. However, the same example written in C and compiled with LLVM's C front-end (clang) worked as expected. It seems only the C++ front-end does this stripping.

Godbolt: https://godbolt.org/z/csnjMjo7n

Mar 22 '24 18:03 rilysh

Infinite loops without side effects are undefined behavior in C++, so LLVM happily deletes them. You can disable this behavior by passing -fno-finite-loops to clang. ARM's documentation suggests putting an empty volatile inline asm in the loop: https://developer.arm.com/documentation/dui0773/d/Coding-Considerations/Infinite-Loops
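
A minimal sketch of the ARM suggestion applied to the example above, assuming a compiler that accepts GNU-style inline asm (GCC or Clang). The names `compiler_barrier` and `hang_forever` are made up for illustration and are not taken from the ARM page:

```cpp
// Hypothetical helper: an empty volatile inline asm statement.
// It emits no instructions, but the compiler has to treat it as a
// side effect, so a loop containing it is not a side-effect-free loop.
// Assumes GNU-style inline asm (GCC/Clang); this is not standard C++.
inline void compiler_barrier() noexcept {
  asm volatile("");
}

// Hypothetical helper: spins forever and never returns.
[[noreturn]] inline void hang_forever() noexcept {
  for (;;) {
    compiler_barrier();
  }
}
```

Calling `hang_forever()` at the end of `main` in place of `for (;;) {}` should keep the loop at -O1 and above. Passing -fno-finite-loops is the alternative that needs no source change.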

There are various threads on this topic in the llvm community. For example https://github.com/llvm/llvm-project/issues/60622

Mar 22 '24 19:03 topperc

Note that the above loop is becoming well-defined in future versions of C++, though: https://isocpp.org/files/papers/P2809R3.html

GCC just happens to follow that paper already, while it looks like clang does not. (Note that under older C++ standards both compilers' results are valid, since the behavior is undefined.)

Mar 22 '24 21:03 pinskia

Confirmed this is now well-defined behavior.

Mar 23 '24 02:03 shafik

Thanks for the helpful comments! For portability, I'll go with ARM's recommendation, at least until P2809R3 is accepted by LLVM.

Mar 24 '24 05:03 rilysh
