abseil-cpp icon indicating copy to clipboard operation
abseil-cpp copied to clipboard

Use of uninitialized memory due to alias violation in (flat) map/set

Open Flamefire opened this issue 3 years ago • 4 comments

Describe the bug

In short: The assumption stated at https://github.com/abseil/abseil-cpp/blob/311bbd2e50ea35e921a08186840d3b6ca279e880/absl/container/internal/container_memory.h#L322-L325 that accessing an inactive union member is allowed does not hold for at least GCC. This leads to reads of uninitialized memory which manifests in actual bugs and buffer-overruns (out-of-bounds access) in e.g. TensorFlow 2.4.1 with GCC 8.3-8.5. See https://github.com/tensorflow/tensorflow/issues/47179

The only solution is to remove this use of an union and in general do not alias classes in unions. It might be possible to work around this "mostly" with the may_alias GCC attribute, but that is also brittle and potentially inefficient.

Detailed description

The following code (simplified from TF) may read uninitialized memory and may place bogus values into the idx_vec (see the linked TF issue):

      absl::flat_hash_map<absl::string_view, int32_t> uniq;
      Tensor<string> Tin(N) = ...;
      Tensor<int32_t> idx_vec out(N);
      for (Eigen::Index i = 0, j = 0; i < N; ++i) {
        auto it = uniq.emplace(Tin(i), j);
        idx_vec(i) = it.first->second; // Invalid read here
        if (it.second) {
          ++j;
        }
      }

The cause can be seen when removing all the abstractions happening inside the emplace and iterator dereferencing which boils down to:

// Preparation of flat_map
char* buffer = malloc(...);
slot_type* slot_buffer = reinterpret_cast<slot_type*>(buffer);

// on emplace:
size_t x = ...;
slot_type* slot = slot_buffer + x;
new(slot) slot_type;
new(&slot->mutable_value) pair<K, V>(k, v);
slot_type* slot2 = slot_buffer + x; // same as before, actually done through the iterator
const pair<const K, V>& element = slot->value; // https://github.com/abseil/abseil-cpp/blob/311bbd2e50ea35e921a08186840d3b6ca279e880/absl/container/internal/container_memory.h#L357
idx_vec[i] = element.second;

Not that slot_type is this union: https://github.com/abseil/abseil-cpp/blob/311bbd2e50ea35e921a08186840d3b6ca279e880/absl/container/internal/container_memory.h#L326-L337

Now the problem is that GCC does not implement "common initial sequence" exception referred to in https://timsong-cpp.github.io/cppwp/n3337/class.mem#19 because it has major negative implications, see e.g. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892 for some discussion on this topic. GCC however does allow union-based aliasing if (and only if) accessed through the union, i.e. the following is explicitly supported by GCC:

union {
  struct{ int x; } x;
  struct{ int y; int z;} y;
} u;
u.x.x = 1;
return u.y.y;

What is not supported is taking pointers or accesses not involving the union:

union {
  struct{ int x; } x;
  struct{ int y; int z;} y;
} u;
auto& x = u.x;
auto& y = u.y;
x.x = 1;
return y.y;

Plausible reasoning: Type based alias analysis sees different types for x and y and hence assumes those do not alias.

This is what happens in the case of of the slot-type or in general with member functions: new(&slot->mutable_value) pair<K, V>(k, v); is "translated" to:

pair<K, V>* p = &slot->mutable_value; // no-op placement new
p->pair::pair(k, v); // ctor call which in turn is expanded to: 
this->first = k;
this->second = v;

As you can see the "access through union" rule gets violated in 2 places: a pointer to an (aliased) member is created, the pointer is used in a (special) member function (here: constructor). In both cases the "reference" to the union is lost and GCC assumes this pointer does not alias a pointer to another type.

So coming back to the Abseil code:

new(&slot->mutable_value) pair<K, V>(k, v); // Uses a pointer of type: pair<K, V>
slot_type* slot = slot_buffer + x; // same as before, actually done through the iterator
const pair<const K, V>& element = slot->value;
idx_vec[i] = element.second; // Uses a pointer/reference of type pair<const K, V>

Hence the compiler assumes at some point that the write of the ctor and the read of the iterator are independent and schedules the load of the latter before the former for increased performance which in fact reads the (initially) uninitialized memory allocated at https://github.com/abseil/abseil-cpp/blob/master/absl/container/internal/raw_hash_set.h#L1554-L1555

See also my discussion with a GCC developer at https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65892

Final note: Again this is not something hypothetical but comes from an actual bug created when compiling otherwise valid code with GCC 8 for skylake/icelake. Things like -fno-strict-aliasing make it disappear as do changes to the code that seemingly affect instruction scheduling. E.g. reading the iterator a second time returns the correct value, but that is only be coincidence.

Flamefire avatar Jun 16 '21 12:06 Flamefire

Thanks for a thorough report. @sbenzaquen, can you take a look at this one?

derekmauro avatar Jun 16 '21 13:06 derekmauro

Just to update. It seems that the standard libraries use const_cast where necessary to cast away the constness of keys when implementing map/unordered_map. At least libstc++ and libc++ do. Doing so is UB, but the standard library can do it. Still investigating options.

sbenzaquen avatar Jun 18 '21 19:06 sbenzaquen

Haven't seen them using const_cast for that yet and I'm not sure for what a mutable key is actually needed as they use node-based structures.

For comparison: The Boost flat maps use std::pair<K, V> for that and simply document changing the key as an invalid operation.

Flamefire avatar Jun 21 '21 07:06 Flamefire

std needs mutable access to the key to implement node_handle::key(). They do it through const_cast.

sbenzaquen avatar Jun 21 '21 14:06 sbenzaquen