Thread safety race condition and proposed fix
As was decided in a meeting, before introducing changes that improve multi-threaded scaling we are to improve thread safety. This PR adds fixes for the following data races:
- Accessing
tags_andtag_groups_insta::Searchwhile they're being reallocated in another thread. It's not enough to delay the swap, as reading from an array is two operations on x86 (load the pointer, then load the array element). The array may get swapped out and freed in between those. - Accessing and modifying bit fields from multiple threads. It's not possible to write individual bits to RAM, the best you can do is read a byte, do a bit operation on it, and write it back. Moreover, compilers can (and do) read up to 8 bytes, do bit operations on that, and write the whole 8 bytes back. If between reading and writing a field we're not even accessing (seemingly) changes, then the write-back overwrites that change.
- Indexing and inserting in an
ObjectTableorArrayTableinsta::Graphfrom multiple threads. - A race between
PathGroup::savablewhich accessesPathEnds andPathGroup::prune()which deletesPathEnds.
These changes incur a significant slowdown (about 25%).
Note that these are only the races found by ThreadSanitizer, not necessarily all of them.
Here's a thread sanitizer log of BlackParrot GRT on master
Example warning:
WARNING: ThreadSanitizer: data race (pid=709934)
Write of size 4 at 0x7b880057571c by thread T6 (mutexes: write M0):
#0 sta::Vertex::setBfsInQueue(sta::BfsIndex, bool) /home/krzysztof/work/OpenROAD-flow-scripts/tools/OpenROAD/src/sta/graph/Graph.cc:1375 (openroad+0xe6c885) (BuildId: ef1a58ea73076503ffc3dd20d054940867fe35e6)
#1 sta::BfsIterator::enqueue(sta::Vertex*) /home/krzysztof/work/OpenROAD-flow-scripts/tools/OpenROAD/src/sta/search/Bfs.cc:261 (openroad+0xf5beeb) (BuildId: ef1a58ea73076503ffc3dd20d054940867fe35e6)
#2 sta::BfsFwdIterator::enqueueAdjacentVertices(sta::Vertex*, sta::SearchPred*, int) /home/krzysztof/work/OpenROAD-flow-scripts/tools/OpenROAD/src/sta/search/Bfs.cc:370 (openroad+0xf5c218) (BuildId: ef1a58ea73076503ffc3dd20d054940867fe35e6)
#3 sta::BfsIterator::enqueueAdjacentVertices(sta::Vertex*) /home/krzysztof/work/OpenROAD-flow-scripts/tools/OpenROAD/src/sta/search/Bfs.cc:120 (openroad+0xf5a873) (BuildId: ef1a58ea73076503ffc3dd20d054940867fe35e6)
#4 sta::GraphDelayCalc::findVertexDelay(sta::Vertex*, sta::ArcDelayCalc*, bool) /home/krzysztof/work/OpenROAD-flow-scripts/tools/OpenROAD/src/sta/dcalc/GraphDelayCalc.cc:597 (openroad+0xe6113a) (BuildId: ef1a58ea73076503ffc3dd20d054940867fe35e6)
#5 sta::FindVertexDelays::visit(sta::Vertex*) /home/krzysztof/work/OpenROAD-flow-scripts/tools/OpenROAD/src/sta/dcalc/GraphDelayCalc.cc:229 (openroad+0xe61205) (BuildId: ef1a58ea73076503ffc3dd20d054940867fe35e6)
#6 operator() /home/krzysztof/work/OpenROAD-flow-scripts/tools/OpenROAD/src/sta/search/Bfs.cc:201 (openroad+0xf597a1) (BuildId: ef1a58ea73076503ffc3dd20d054940867fe35e6)
#7 __invoke_impl<void, sta::BfsIterator::visitParallel(sta::Level, sta::VertexVisitor*)::<lambda(int)>&, int> /usr/include/c++/13.2.1/bits/invoke.h:61 (openroad+0xf597a1)
#8 __invoke_r<void, sta::BfsIterator::visitParallel(sta::Level, sta::VertexVisitor*)::<lambda(int)>&, int> /usr/include/c++/13.2.1/bits/invoke.h:111 (openroad+0xf597a1)
#9 _M_invoke /usr/include/c++/13.2.1/bits/std_function.h:290 (openroad+0xf597a1)
#10 std::function<void (int)>::operator()(int) const /usr/include/c++/13.2.1/bits/std_function.h:591 (openroad+0x102741f) (BuildId: ef1a58ea73076503ffc3dd20d054940867fe35e6)
#11 sta::DispatchQueue::dispatch_thread_handler(unsigned long) /home/krzysztof/work/OpenROAD-flow-scripts/tools/OpenROAD/src/sta/util/DispatchQueue.cc:100 (openroad+0x102741f)
#12 void std::__invoke_impl<void, void (sta::DispatchQueue::*)(unsigned long), sta::DispatchQueue*, unsigned long>(std::__invoke_memfun_deref, void (sta::DispatchQueue::*&&)(unsigned long), sta::DispatchQueue*&&, unsigned long&&) /usr/include/c++/13.2.1/bits/invoke.h:74 (openroad+0x102844e) (BuildId: ef1a58ea73076503ffc3dd20d054940867fe35e6)
#13 std::__invoke_result<void (sta::DispatchQueue::*)(unsigned long), sta::DispatchQueue*, unsigned long>::type std::__invoke<void (sta::DispatchQueue::*)(unsigned long), sta::DispatchQueue*, unsigned long>(void (sta::DispatchQueue::*&&)(unsigned long), sta::DispatchQueue*&&, unsigned long&&) /usr/include/c++/13.2.1/bits/invoke.h:96 (openroad+0x102844e)
#14 void std::thread::_Invoker<std::tuple<void (sta::DispatchQueue::*)(unsigned long), sta::DispatchQueue*, unsigned long> >::_M_invoke<0ul, 1ul, 2ul>(std::_Index_tuple<0ul, 1ul, 2ul>) /usr/include/c++/13.2.1/bits/std_thread.h:292 (openroad+0x102844e)
#15 std::thread::_Invoker<std::tuple<void (sta::DispatchQueue::*)(unsigned long), sta::DispatchQueue*, unsigned long> >::operator()() /usr/include/c++/13.2.1/bits/std_thread.h:299 (openroad+0x102844e)
#16 std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (sta::DispatchQueue::*)(unsigned long), sta::DispatchQueue*, unsigned long> > >::_M_run() /usr/include/c++/13.2.1/bits/std_thread.h:244 (openroad+0x102844e)
#17 execute_native_thread_routine /usr/src/debug/gcc/gcc/libstdc++-v3/src/c++11/thread.cc:104 (libstdc++.so.6+0xe1942) (BuildId: 26d15cc010d7e7c8831795d6b4f135e1d1958ee1)
Previous read of size 1 at 0x7b880057571f by thread T9:
#0 sta::Vertex::bfsInQueue(sta::BfsIndex) const /home/krzysztof/work/OpenROAD-flow-scripts/tools/OpenROAD/src/sta/graph/Graph.cc:1367 (openroad+0xe6c80e) (BuildId: ef1a58ea73076503ffc3dd20d054940867fe35e6)
#1 sta::BfsIterator::enqueue(sta::Vertex*) /home/krzysztof/work/OpenROAD-flow-scripts/tools/OpenROAD/src/sta/search/Bfs.cc:257 (openroad+0xf5be61) (BuildId: ef1a58ea73076503ffc3dd20d054940867fe35e6)
#2 sta::BfsFwdIterator::enqueueAdjacentVertices(sta::Vertex*, sta::SearchPred*, int) /home/krzysztof/work/OpenROAD-flow-scripts/tools/OpenROAD/src/sta/search/Bfs.cc:370 (openroad+0xf5c218) (BuildId: ef1a58ea73076503ffc3dd20d054940867fe35e6)
#3 sta::BfsIterator::enqueueAdjacentVertices(sta::Vertex*) /home/krzysztof/work/OpenROAD-flow-scripts/tools/OpenROAD/src/sta/search/Bfs.cc:120 (openroad+0xf5a873) (BuildId: ef1a58ea73076503ffc3dd20d054940867fe35e6)
#4 sta::GraphDelayCalc::findVertexDelay(sta::Vertex*, sta::ArcDelayCalc*, bool) /home/krzysztof/work/OpenROAD-flow-scripts/tools/OpenROAD/src/sta/dcalc/GraphDelayCalc.cc:597 (openroad+0xe6113a) (BuildId: ef1a58ea73076503ffc3dd20d054940867fe35e6)
#5 sta::FindVertexDelays::visit(sta::Vertex*) /home/krzysztof/work/OpenROAD-flow-scripts/tools/OpenROAD/src/sta/dcalc/GraphDelayCalc.cc:229 (openroad+0xe61205) (BuildId: ef1a58ea73076503ffc3dd20d054940867fe35e6)
#6 operator() /home/krzysztof/work/OpenROAD-flow-scripts/tools/OpenROAD/src/sta/search/Bfs.cc:201 (openroad+0xf597a1) (BuildId: ef1a58ea73076503ffc3dd20d054940867fe35e6)
#7 __invoke_impl<void, sta::BfsIterator::visitParallel(sta::Level, sta::VertexVisitor*)::<lambda(int)>&, int> /usr/include/c++/13.2.1/bits/invoke.h:61 (openroad+0xf597a1)
#8 __invoke_r<void, sta::BfsIterator::visitParallel(sta::Level, sta::VertexVisitor*)::<lambda(int)>&, int> /usr/include/c++/13.2.1/bits/invoke.h:111 (openroad+0xf597a1)
#9 _M_invoke /usr/include/c++/13.2.1/bits/std_function.h:290 (openroad+0xf597a1)
#10 std::function<void (int)>::operator()(int) const /usr/include/c++/13.2.1/bits/std_function.h:591 (openroad+0x102741f) (BuildId: ef1a58ea73076503ffc3dd20d054940867fe35e6)
#11 sta::DispatchQueue::dispatch_thread_handler(unsigned long) /home/krzysztof/work/OpenROAD-flow-scripts/tools/OpenROAD/src/sta/util/DispatchQueue.cc:100 (openroad+0x102741f)
#12 void std::__invoke_impl<void, void (sta::DispatchQueue::*)(unsigned long), sta::DispatchQueue*, unsigned long>(std::__invoke_memfun_deref, void (sta::DispatchQueue::*&&)(unsigned long), sta::DispatchQueue*&&, unsigned long&&) /usr/include/c++/13.2.1/bits/invoke.h:74 (openroad+0x102844e) (BuildId: ef1a58ea73076503ffc3dd20d054940867fe35e6)
#13 std::__invoke_result<void (sta::DispatchQueue::*)(unsigned long), sta::DispatchQueue*, unsigned long>::type std::__invoke<void (sta::DispatchQueue::*)(unsigned long), sta::DispatchQueue*, unsigned long>(void (sta::DispatchQueue::*&&)(unsigned long), sta::DispatchQueue*&&, unsigned long&&) /usr/include/c++/13.2.1/bits/invoke.h:96 (openroad+0x102844e)
#14 void std::thread::_Invoker<std::tuple<void (sta::DispatchQueue::*)(unsigned long), sta::DispatchQueue*, unsigned long> >::_M_invoke<0ul, 1ul, 2ul>(std::_Index_tuple<0ul, 1ul, 2ul>) /usr/include/c++/13.2.1/bits/std_thread.h:292 (openroad+0x102844e)
#15 std::thread::_Invoker<std::tuple<void (sta::DispatchQueue::*)(unsigned long), sta::DispatchQueue*, unsigned long> >::operator()() /usr/include/c++/13.2.1/bits/std_thread.h:299 (openroad+0x102844e)
#16 std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (sta::DispatchQueue::*)(unsigned long), sta::DispatchQueue*, unsigned long> > >::_M_run() /usr/include/c++/13.2.1/bits/std_thread.h:244 (openroad+0x102844e)
#17 execute_native_thread_routine /usr/src/debug/gcc/gcc/libstdc++-v3/src/c++11/thread.cc:104 (libstdc++.so.6+0xe1942) (BuildId: 26d15cc010d7e7c8831795d6b4f135e1d1958ee1)
This one has been fixed in OpenSTA. We are waiting until we uprev the next version then we will close this.
Issues or PRs should be filed with https://github.com/parallaxsw/OpenSTA if still relevant. This is effectively a fork (though not strictly for historical reasons).