ConcurrentHashMap allocates unaligned memory for SegmentT, which is declared alignas(64)

Open xiegx94 opened this issue 3 years ago • 3 comments

Bugs

ConcurrentHashMap allocates memory with std::allocator<uint8_t>, which does not guarantee 64-byte alignment. This caused a core dump when I compiled with clang14 and -mavx2:

Clang14 image

Clang14 IR image

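Clang14 treats the mutex as 64-byte aligned (because the enclosing class is alignas(64)) and therefore emits the vmovaps instruction as an optimization. vmovaps requires an aligned memory operand, so it faults when the segment is placed at an unaligned address.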
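The alignment assumption described above can be checked at run time. The following is a minimal sketch, not code from folly: `AlignedSegment` and `isAlignedFor` are hypothetical names introduced only for illustration, and the struct is a simplified stand-in for the real segment class.

```cpp
#include <cstddef>
#include <cstdint>
#include <mutex>

// Hypothetical stand-in for the segment type: a 64-byte-aligned
// class whose first member is a mutex.
struct alignas(64) AlignedSegment {
  std::mutex m;
  float load_factor = 1.0f;
};

// The compiler is entitled to assume this for every AlignedSegment object.
static_assert(alignof(AlignedSegment) == 64, "segment must be 64B aligned");

// Returns true when p satisfies T's declared alignment.
template <typename T>
bool isAlignedFor(const void* p) {
  return reinterpret_cast<std::uintptr_t>(p) % alignof(T) == 0;
}
```

Calling `isAlignedFor<AlignedSegment>(ptr)` on a pointer returned by a byte allocator, before placement new, would show whether the address is safe to construct at.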

Reproduction code

I've copied some code from folly to reproduce this bug. The folly version is v2018.08.20.00. This example doesn't core dump, but objdump shows that it uses vmovaps, and the clang IR shows why.

I'm not sure whether this is still a problem on the main branch.

# clang IR
/path/to/clang++14 -S -emit-llvm -std=c++14 -O3 -mavx2 main.cpp
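#include <atomic>
#include <cstddef>     // size_t
#include <cstdint>     // uint8_t, int64_t
#include <functional>  // std::hash, std::equal_to
#include <iostream>
#include <memory>      // std::allocator
#include <mutex>
#include <new>         // placement new
#include <string>
#include <stdio.h>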
 
template <
    typename KeyType,
    typename ValueType,
    uint8_t ShardBits = 8,
    typename HashFn = std::hash<KeyType>,
    typename KeyEqual = std::equal_to<KeyType>,
    typename Allocator = std::allocator<uint8_t>,
    template <typename> class Atom = std::atomic,
    class Mutex = std::mutex>
class alignas(64) ConcurrentHashMapSegment {
 
public:
  ConcurrentHashMapSegment(
      size_t initial_buckets,
      float load_factor,
      size_t max_size)
      : load_factor_(load_factor), max_size_(max_size) {
    std::cout << "initial_buckets: " << initial_buckets << std::endl;
  }
 
private:
  Mutex m_;
  float load_factor_;
  size_t const max_size_;
};
 
int main() {
 
  using SegmentT = ConcurrentHashMapSegment<std::string, int64_t>;
  using Allocator = std::allocator<uint8_t>;
  
  SegmentT* newseg = (SegmentT*)Allocator().allocate(sizeof(SegmentT));
  // SegmentT* newseg = new SegmentT(16, 1.0, 16);
  printf("%p\n", newseg);
  newseg = new (newseg)
    SegmentT(16, 1.0, 16);
  
}
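One possible workaround, sketched here under assumptions and not taken from folly, is to over-allocate from the byte allocator and use `std::align` to find a 64-byte-aligned address inside the buffer before placement new. `DemoSegment`, `AlignedSlot`, `makeAlignedSegment`, and `destroyAlignedSegment` are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <new>

// Hypothetical stand-in for SegmentT.
struct alignas(64) DemoSegment {
  explicit DemoSegment(float lf) : load_factor(lf) {}
  float load_factor;
};

// Keeps the raw buffer (needed for deallocation) next to the
// aligned object that lives inside it.
struct AlignedSlot {
  std::uint8_t* raw;
  std::size_t raw_size;
  DemoSegment* seg;
};

inline AlignedSlot makeAlignedSegment(float load_factor) {
  std::allocator<std::uint8_t> alloc;
  // Over-allocate by alignment - 1 bytes so an aligned address
  // with enough room always exists inside the buffer.
  const std::size_t raw_size = sizeof(DemoSegment) + alignof(DemoSegment) - 1;
  std::uint8_t* raw = alloc.allocate(raw_size);

  void* p = raw;
  std::size_t space = raw_size;
  void* aligned =
      std::align(alignof(DemoSegment), sizeof(DemoSegment), p, space);
  assert(aligned != nullptr);

  return AlignedSlot{raw, raw_size, new (aligned) DemoSegment(load_factor)};
}

inline void destroyAlignedSegment(AlignedSlot& slot) {
  slot.seg->~DemoSegment();
  std::allocator<std::uint8_t>().deallocate(slot.raw, slot.raw_size);
  slot.seg = nullptr;
  slot.raw = nullptr;
}
```

This costs up to 63 extra bytes per segment but works with -std=c++14 and any byte allocator. With C++17, the aligned form of `operator new` taking `std::align_val_t` may be a simpler option if a custom allocator is not required.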

xiegx94, Aug 24 '22 05:08