abseil-cpp
abseil-cpp copied to clipboard
Use of uninitialized memory due to alias violation in (flat) map/set
Describe the bug
In short: The assumption stated at https://github.com/abseil/abseil-cpp/blob/311bbd2e50ea35e921a08186840d3b6ca279e880/absl/container/internal/container_memory.h#L322-L325 that accessing an inactive union member is allowed does not hold for at least GCC. This leads to reads of uninitialized memory which manifests in actual bugs and buffer-overruns (out-of-bounds access) in e.g. TensorFlow 2.4.1 with GCC 8.3-8.5. See https://github.com/tensorflow/tensorflow/issues/47179
The only solution is to remove this use of an union and in general do not alias classes in unions. It might be possible to work around this "mostly" with the may_alias
GCC attribute, but that is also brittle and potentially inefficient.
Detailed description
The following code (simplified from TF) may read uninitialized memory and may place bogus values into the idx_vec (see the linked TF issue):
absl::flat_hash_map<absl::string_view, int32_t> uniq;
Tensor<string> Tin(N) = ...;
Tensor<int32_t> idx_vec out(N);
for (Eigen::Index i = 0, j = 0; i < N; ++i) {
auto it = uniq.emplace(Tin(i), j);
idx_vec(i) = it.first->second; // Invalid read here
if (it.second) {
++j;
}
}
The cause can be seen when removing all the abstractions happening inside the emplace and iterator dereferencing which boils down to:
// Preparation of flat_map
char* buffer = malloc(...);
slot_type* slot_buffer = reinterpret_cast<slot_type*>(buffer);
// on emplace:
size_t x = ...;
slot_type* slot = slot_buffer + x;
new(slot) slot_type;
new(&slot->mutable_value) pair<K, V>(k, v);
slot_type* slot2 = slot_buffer + x; // same as before, actually done through the iterator
const pair<const K, V>& element = slot->value; // https://github.com/abseil/abseil-cpp/blob/311bbd2e50ea35e921a08186840d3b6ca279e880/absl/container/internal/container_memory.h#L357
idx_vec[i] = element.second;
Not that slot_type
is this union: https://github.com/abseil/abseil-cpp/blob/311bbd2e50ea35e921a08186840d3b6ca279e880/absl/container/internal/container_memory.h#L326-L337
Now the problem is that GCC does not implement "common initial sequence" exception referred to in https://timsong-cpp.github.io/cppwp/n3337/class.mem#19 because it has major negative implications, see e.g. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892 for some discussion on this topic. GCC however does allow union-based aliasing if (and only if) accessed through the union, i.e. the following is explicitly supported by GCC:
union {
struct{ int x; } x;
struct{ int y; int z;} y;
} u;
u.x.x = 1;
return u.y.y;
What is not supported is taking pointers or accesses not involving the union:
union {
struct{ int x; } x;
struct{ int y; int z;} y;
} u;
auto& x = u.x;
auto& y = u.y;
x.x = 1;
return y.y;
Plausible reasoning: Type based alias analysis sees different types for x and y and hence assumes those do not alias.
This is what happens in the case of of the slot-type or in general with member functions:
new(&slot->mutable_value) pair<K, V>(k, v);
is "translated" to:
pair<K, V>* p = &slot->mutable_value; // no-op placement new
p->pair::pair(k, v); // ctor call which in turn is expanded to:
this->first = k;
this->second = v;
As you can see the "access through union" rule gets violated in 2 places: a pointer to an (aliased) member is created, the pointer is used in a (special) member function (here: constructor). In both cases the "reference" to the union is lost and GCC assumes this pointer does not alias a pointer to another type.
So coming back to the Abseil code:
new(&slot->mutable_value) pair<K, V>(k, v); // Uses a pointer of type: pair<K, V>
slot_type* slot = slot_buffer + x; // same as before, actually done through the iterator
const pair<const K, V>& element = slot->value;
idx_vec[i] = element.second; // Uses a pointer/reference of type pair<const K, V>
Hence the compiler assumes at some point that the write of the ctor and the read of the iterator are independent and schedules the load of the latter before the former for increased performance which in fact reads the (initially) uninitialized memory allocated at https://github.com/abseil/abseil-cpp/blob/master/absl/container/internal/raw_hash_set.h#L1554-L1555
See also my discussion with a GCC developer at https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892
Final note: Again this is not something hypothetical but comes from an actual bug created when compiling otherwise valid code with GCC 8 for skylake/icelake. Things like -fno-strict-aliasing
make it disappear as do changes to the code that seemingly affect instruction scheduling. E.g. reading the iterator a second time returns the correct value, but that is only be coincidence.
Thanks for a thorough report. @sbenzaquen, can you take a look at this one?
Just to update.
It seems that the standard libraries use const_cast
where necessary to cast away the constness of keys when implementing map/unordered_map. At least libstc++ and libc++ do. Doing so is UB, but the standard library can do it.
Still investigating options.
Haven't seen them using const_cast
for that yet and I'm not sure for what a mutable key is actually needed as they use node-based structures.
For comparison: The Boost flat maps use std::pair<K, V>
for that and simply document changing the key as an invalid operation.
std needs mutable access to the key to implement node_handle::key()
. They do it through const_cast
.