string literal storage optimization results in wrong-code
clang++ 15.0.0
compiler options: -std=c++20 -O2 -pedantic-errors
Consider the following code (this has expected behavior):
constexpr const char* ptr = "foo";
const char arr1[] = "foo";
const char arr2[] = "foo";
bool foo() {
return +arr1 == ptr; // returns true
}
bool bar() {
return +arr2 == ptr; // returns false
}
So the string literal object referred by ptr overlaps with arr1. Unsurprisingly it can't overlap with arr2 at the same time, since it would mean that arr1 and arr2 overlap, but that wouldn't be conforming.
However if similar array initialisations and pointer comparisons are done in different translation units then we can get contradictory behavior.
// header.h
inline constexpr const char* ptr = "foo"
extern const char arr1[];
extern const char arr2[];
bool compare_arr1_to_ptr();
bool compare_arr2_to_ptr();
//arr1.cpp
const char arr1[] = "foo";
bool compare_arr1_to_ptr() {
return +arr1 == ptr; // returns true
}
//arr2.cpp
const char arr2[] = "foo";
bool compare_arr2_to_ptr() {
return +arr2 == ptr; // returns true
}
//main.cpp
#include <iostream>
int main() {
std::cout << std::boolalpha << '\n';
std::cout << compare_arr1_to_ptr() << '\n'; //true
std::cout << compare_arr1_to_ptr() << '\n'; //true
std::cout << (+arr1 == ptr) << '\n'; //false
std::cout << (+arr2 == ptr) << '\n'; //false
std::cout << (+arr1 == +arr2) << '\n'; //false
}
https://godbolt.org/z/7K8eqMqrf
None of the pointer comparisons in the program are unspecified as per [expr.eq].
It is tempting to apply the following wording in [lex.string]:
...whether successive evaluations of a string-literal yield the same or a different object is unspecified.
But the string literal is evaluated exactly once, at the initialization of ptr. Further evaluations of ptr itself refers to the same object.
These out of the way, we have the following non-conforming behaviors of the executed program:
-
compare_arr1_to_ptr()and(+arr1 == ptr)evaluate to different values. But+arr1either represents the same address ofptror not. The pointers have the same values in the two evaluations. The same applies to the comparisons involvingarr2. - As both
compare_arr1_to_ptr()andcompare_arr2_to_ptr()evaluate totrue, therefore+arr1represents the same address asptr, andptrrepresents the same address as+arr2. Butarr1andarr2are distinct objects with disjoint bytes of storage [basic.memobj].
The root issue seems to be that ptr gets a different definition in different translation units. However the definition of ptr does not violate the one-definition-rule, therefore this implementation is non-conforming.
[CWG1823] was intended as an escape-hatch for inline functions, but inline functions are not fully exempt from this problem.
Consider:
inline const char* foo() {
static constexpr const char* ptr = "foo";
return ptr;
}
or:
constexpr const char* foo() {
return "foo"; // allowed to return pointers to different objects at different _evaluations_ (not instantiations)
}
inline constexpr const char* ptr = foo(); // foo() is evaluated once to initialize ptr
Note that the problem doesn't appear at O0 so it's likely an optimization bug. Any insight @zygoloid ?
@llvm/issue-subscribers-clang-codegen
This is primarily an ABI issue -- we need some way for the value of ptr emitted in both translation units to reliably be the same at link time. Eg, if we have:
// TU 1
inline constexpr const char *ptr = "foo";
const char *ptrA = ptr;
// TU 2
inline constexpr const char *ptr = "foo";
const char *ptrB = ptr;
... then ptrA and ptrB need to be statically initialized to the same value, which is not something we can do within the current ABI rules.
The corresponding ABI issue is https://github.com/itanium-cxx-abi/cxx-abi/issues/78
I have been discussing this issue with @leni536 and I'm of the view that a non-string-literal object cannot overlap a string literal object, and thus in the first code snippet, foo and bar should both always return false, since ptr points to a string literal object, while arr1 and arr2 are non-string-literal objects.
What gave rise to the Clang approach here? A belief that the standard intentionally gives the freedom to make string literal objects overlap non-string-literal objects? Or just the fact that the wording is not crystal clear in forbidding it?
Maybe string literal symbol mangling can resolve part of the issue, however note that for the following TU there is no string literal symbol emitted at all:
inline constexpr const char* ptr = "foo";
extern const char arr[] = "foo";
bool foo() {
return ptr == arr;
}
auto get_ptr() {
return ptr;
}
auto get_arr() {
return arr;
}
This results in the following codegen when compiled on clang 16, with -std=c++20 -O1 -pedantic-errors:
foo(): # @foo()
mov al, 1
ret
get_ptr(): # @get_ptr()
lea rax, [rip + arr]
ret
get_arr(): # @get_arr()
lea rax, [rip + arr]
ret
arr:
.asciz "foo"
https://godbolt.org/z/j7rbx61zY
Some observations:
- In
foo()the pointer comparison gets optimized totrue. - In
get_ptr()the reference to the string literal gets replaced with a reference to the array.
So even if the ABI of string literals gets reworked, the optimization also needs to be reworked as it becomes unknown at compile time whether the string literal ends up sharing storage with the array. I'm not sure if it's worth replacing foo()'s body with a runtime check instead of leaving the optimization and optimizing it to return false. So there is a trade off here.
Note that in the original example with 3 TUs all three of ptr, arr1 and arr2 can't refer to the same object. Something needs to give.
I have been discussing this issue with @leni536 and I'm of the view that a non-string-literal object cannot overlap a string literal object, and thus in the first code snippet,
fooandbarshould both always return false, sinceptrpoints to a string literal object, whilearr1andarr2are non-string-literal objects.
Do you think the standard is clear that this is not allowed? The wording around string literal objects is vague (and the previous wording before we introduced string literal objects was even more vague). I think it'd be a good idea to file a core issue on this subject if there isn't one already.
What gave rise to the Clang approach here? A belief that the standard intentionally gives the freedom to make string literal objects overlap non-string-literal objects? Or just the fact that the wording is not crystal clear in forbidding it?
The only tool that LLVM gives us to perform merging of string literals is the unnamed_addr attribute, which permits merging a constant with that attribute into another constant with the same value (regardless of the properties of that other constant). I don't know whether the possibility of merging a string literal with some other constant part of the program was explicitly considered and thought to be OK, or simply not considered.
Maybe string literal symbol mangling can resolve part of the issue, however note that for the following TU there is no string literal symbol emitted at all:
There is, but it got optimized away. If you look at the LLVM IR prior to optimization you can see the @.str internal symbol that Clang emits.