多线程调用GlobalData::RegisterService会出现coredump,asan能查出heap-use-after-free
Describe the bug 多线程调用GlobalData::RegisterService,且service填同一个,会出现coredump,必现。如下方式:
std::vector<std::thread> threads;
for (int i = 0; i < 20; ++i) {
threads.emplace_back([]() {
while(1) {
apollo::cyber::common::GlobalData::RegisterService("testminisim/test_register_msg");
usleep(100);
}
});
}
To Reproduce Steps to reproduce the behavior:
- Go to '...'
- Click on '....'
- Scroll down to '....'
- See error
Expected behavior A clear and concise description of what you expected to happen.
Screenshots If applicable, add screenshots to help explain your problem.
Desktop (please complete the following information):
- OS: [e.g. iOS]
- Browser [e.g. chrome, safari]
- Version [e.g. 22]
Smartphone (please complete the following information):
- Device: [e.g. iPhone6]
- OS: [e.g. iOS8.1]
- Browser [e.g. stock browser, safari]
- Version [e.g. 22]
Additional context Add any other context about the problem here.
知道原因了。GlobalData::RegisterService写的就有问题
多线程调用RegisterService的时候,如果service填的还是同一个,出现踩内存的场景,虽然service_id_map_是线程安全的,但是只是Get和Set接口线程安全,在通过Get拿到name后,这个name填的是二级指针,所以name本身就是service_id_map_的value,另一个线程可能会再去修改这个value,造成Get出来的name 会有多线程同时读写的风险,用我上面贴的代码去复现,基本必现。
Will check and feedback soon. I tested it and it will indeed report an error.
Will check and feedback soon. I tested it and it will indeed report an error.
修改方案:
uint64_t GlobalData::RegisterService(const std::string& service) {
auto id = Hash(service);
std::lock_guard<std::mutex> lock(service_id_map_mutex_);
while (service_id_map_.Has(id)) {
std::string* name = nullptr;
service_id_map_.Get(id, &name);
if (service == *name) {
break;
}
++id;
AWARN << "Service name hash collision: " << service << " <=> " << *name;
}
service_id_map_.Set(id, service);
return id;
}