folly
ConcurrentHashMap allocates unaligned memory for SegmentT, which is declared alignas(64)
Bugs
ConcurrentHashMap allocates memory through std::allocator<uint8_t>, which does not guarantee memory aligned to 64 bytes.
This caused a core dump when I compiled with clang 14 and -mavx2:
[Image: Clang14]
[Image: Clang14 IR]
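Clang 14 treats the mutex as 64-byte aligned, because it is the first member of a class declared alignas(64), and uses the vmovaps instruction as an optimization. vmovaps requires an aligned memory operand and faults when the address is not aligned.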
Code to reproduce
I've copied some code from folly to reproduce this bug. The folly version is v2018.08.20.00. This example does not core dump, but objdump shows that it uses vmovaps, and the clang IR shows why.
I'm not sure whether this is still a problem on the main branch.
# clang IR
/path/to/clang++14 -S -emit-llvm -std=c++14 -O3 -mavx2 main.cpp
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <functional>
#include <iostream>
#include <memory>
#include <mutex>
#include <new>
#include <string>

// Reduced copy of folly's segment type: the class itself is alignas(64).
template <
    typename KeyType,
    typename ValueType,
    uint8_t ShardBits = 8,
    typename HashFn = std::hash<KeyType>,
    typename KeyEqual = std::equal_to<KeyType>,
    typename Allocator = std::allocator<uint8_t>,
    template <typename> class Atom = std::atomic,
    class Mutex = std::mutex>
class alignas(64) ConcurrentHashMapSegment {
 public:
  ConcurrentHashMapSegment(
      size_t initial_buckets,
      float load_factor,
      size_t max_size)
      : load_factor_(load_factor), max_size_(max_size) {
    std::cout << "initial_buckets: " << initial_buckets << std::endl;
  }

 private:
  Mutex m_; // first member, so the compiler assumes it is 64-byte aligned
  float load_factor_;
  size_t const max_size_;
};

int main() {
  using SegmentT = ConcurrentHashMapSegment<std::string, int64_t>;
  using Allocator = std::allocator<uint8_t>;
  // The byte allocator knows nothing about alignof(SegmentT) == 64.
  SegmentT* newseg = (SegmentT*)Allocator().allocate(sizeof(SegmentT));
  // SegmentT* newseg = new SegmentT(16, 1.0, 16);
  printf("%p\n", (void*)newseg);
  // Placement new into storage that may not be 64-byte aligned.
  newseg = new (newseg) SegmentT(16, 1.0, 16);
}