OpenSTA icon indicating copy to clipboard operation
OpenSTA copied to clipboard

Thread safety race condition and proposed fix

Open kbieganski opened this issue 1 year ago • 2 comments

As was decided in a meeting, before introducing changes that improve multi-threaded scaling we are to improve thread safety. This PR adds fixes for the following data races:

  • Accessing tags_ and tag_groups_ in sta::Search while they're being reallocated in another thread. It's not enough to delay the swap, as reading from an array is two operations on x86 (load the pointer, then load the array element). The array may get swapped out and freed in between those.
  • Accessing and modifying bit fields from multiple threads. It's not possible to write individual bits to RAM, the best you can do is read a byte, do a bit operation on it, and write it back. Moreover, compilers can (and do) read up to 8 bytes, do bit operations on that, and write the whole 8 bytes back. If between reading and writing a field we're not even accessing (seemingly) changes, then the write-back overwrites that change.
  • Indexing and inserting in an ObjectTable or ArrayTable in sta::Graph from multiple threads.
  • A race between PathGroup::savable which accesses PathEnds and PathGroup::prune() which deletes PathEnds.

These changes incur a significant slowdown (about 25%).

Note that these are only the races found by ThreadSanitizer, not necessarily all of them.

kbieganski avatar Feb 28 '24 13:02 kbieganski

Here's a thread sanitizer log of BlackParrot GRT on master

Example warning:

WARNING: ThreadSanitizer: data race (pid=709934)
  Write of size 4 at 0x7b880057571c by thread T6 (mutexes: write M0):
    #0 sta::Vertex::setBfsInQueue(sta::BfsIndex, bool) /home/krzysztof/work/OpenROAD-flow-scripts/tools/OpenROAD/src/sta/graph/Graph.cc:1375 (openroad+0xe6c885) (BuildId: ef1a58ea73076503ffc3dd20d054940867fe35e6)
    #1 sta::BfsIterator::enqueue(sta::Vertex*) /home/krzysztof/work/OpenROAD-flow-scripts/tools/OpenROAD/src/sta/search/Bfs.cc:261 (openroad+0xf5beeb) (BuildId: ef1a58ea73076503ffc3dd20d054940867fe35e6)
    #2 sta::BfsFwdIterator::enqueueAdjacentVertices(sta::Vertex*, sta::SearchPred*, int) /home/krzysztof/work/OpenROAD-flow-scripts/tools/OpenROAD/src/sta/search/Bfs.cc:370 (openroad+0xf5c218) (BuildId: ef1a58ea73076503ffc3dd20d054940867fe35e6)
    #3 sta::BfsIterator::enqueueAdjacentVertices(sta::Vertex*) /home/krzysztof/work/OpenROAD-flow-scripts/tools/OpenROAD/src/sta/search/Bfs.cc:120 (openroad+0xf5a873) (BuildId: ef1a58ea73076503ffc3dd20d054940867fe35e6)
    #4 sta::GraphDelayCalc::findVertexDelay(sta::Vertex*, sta::ArcDelayCalc*, bool) /home/krzysztof/work/OpenROAD-flow-scripts/tools/OpenROAD/src/sta/dcalc/GraphDelayCalc.cc:597 (openroad+0xe6113a) (BuildId: ef1a58ea73076503ffc3dd20d054940867fe35e6)
    #5 sta::FindVertexDelays::visit(sta::Vertex*) /home/krzysztof/work/OpenROAD-flow-scripts/tools/OpenROAD/src/sta/dcalc/GraphDelayCalc.cc:229 (openroad+0xe61205) (BuildId: ef1a58ea73076503ffc3dd20d054940867fe35e6)
    #6 operator() /home/krzysztof/work/OpenROAD-flow-scripts/tools/OpenROAD/src/sta/search/Bfs.cc:201 (openroad+0xf597a1) (BuildId: ef1a58ea73076503ffc3dd20d054940867fe35e6)
    #7 __invoke_impl<void, sta::BfsIterator::visitParallel(sta::Level, sta::VertexVisitor*)::<lambda(int)>&, int> /usr/include/c++/13.2.1/bits/invoke.h:61 (openroad+0xf597a1)
    #8 __invoke_r<void, sta::BfsIterator::visitParallel(sta::Level, sta::VertexVisitor*)::<lambda(int)>&, int> /usr/include/c++/13.2.1/bits/invoke.h:111 (openroad+0xf597a1)
    #9 _M_invoke /usr/include/c++/13.2.1/bits/std_function.h:290 (openroad+0xf597a1)
    #10 std::function<void (int)>::operator()(int) const /usr/include/c++/13.2.1/bits/std_function.h:591 (openroad+0x102741f) (BuildId: ef1a58ea73076503ffc3dd20d054940867fe35e6)
    #11 sta::DispatchQueue::dispatch_thread_handler(unsigned long) /home/krzysztof/work/OpenROAD-flow-scripts/tools/OpenROAD/src/sta/util/DispatchQueue.cc:100 (openroad+0x102741f)
    #12 void std::__invoke_impl<void, void (sta::DispatchQueue::*)(unsigned long), sta::DispatchQueue*, unsigned long>(std::__invoke_memfun_deref, void (sta::DispatchQueue::*&&)(unsigned long), sta::DispatchQueue*&&, unsigned long&&) /usr/include/c++/13.2.1/bits/invoke.h:74 (openroad+0x102844e) (BuildId: ef1a58ea73076503ffc3dd20d054940867fe35e6)
    #13 std::__invoke_result<void (sta::DispatchQueue::*)(unsigned long), sta::DispatchQueue*, unsigned long>::type std::__invoke<void (sta::DispatchQueue::*)(unsigned long), sta::DispatchQueue*, unsigned long>(void (sta::DispatchQueue::*&&)(unsigned long), sta::DispatchQueue*&&, unsigned long&&) /usr/include/c++/13.2.1/bits/invoke.h:96 (openroad+0x102844e)
    #14 void std::thread::_Invoker<std::tuple<void (sta::DispatchQueue::*)(unsigned long), sta::DispatchQueue*, unsigned long> >::_M_invoke<0ul, 1ul, 2ul>(std::_Index_tuple<0ul, 1ul, 2ul>) /usr/include/c++/13.2.1/bits/std_thread.h:292 (openroad+0x102844e)
    #15 std::thread::_Invoker<std::tuple<void (sta::DispatchQueue::*)(unsigned long), sta::DispatchQueue*, unsigned long> >::operator()() /usr/include/c++/13.2.1/bits/std_thread.h:299 (openroad+0x102844e)
    #16 std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (sta::DispatchQueue::*)(unsigned long), sta::DispatchQueue*, unsigned long> > >::_M_run() /usr/include/c++/13.2.1/bits/std_thread.h:244 (openroad+0x102844e)
    #17 execute_native_thread_routine /usr/src/debug/gcc/gcc/libstdc++-v3/src/c++11/thread.cc:104 (libstdc++.so.6+0xe1942) (BuildId: 26d15cc010d7e7c8831795d6b4f135e1d1958ee1)

  Previous read of size 1 at 0x7b880057571f by thread T9:
    #0 sta::Vertex::bfsInQueue(sta::BfsIndex) const /home/krzysztof/work/OpenROAD-flow-scripts/tools/OpenROAD/src/sta/graph/Graph.cc:1367 (openroad+0xe6c80e) (BuildId: ef1a58ea73076503ffc3dd20d054940867fe35e6)
    #1 sta::BfsIterator::enqueue(sta::Vertex*) /home/krzysztof/work/OpenROAD-flow-scripts/tools/OpenROAD/src/sta/search/Bfs.cc:257 (openroad+0xf5be61) (BuildId: ef1a58ea73076503ffc3dd20d054940867fe35e6)
    #2 sta::BfsFwdIterator::enqueueAdjacentVertices(sta::Vertex*, sta::SearchPred*, int) /home/krzysztof/work/OpenROAD-flow-scripts/tools/OpenROAD/src/sta/search/Bfs.cc:370 (openroad+0xf5c218) (BuildId: ef1a58ea73076503ffc3dd20d054940867fe35e6)
    #3 sta::BfsIterator::enqueueAdjacentVertices(sta::Vertex*) /home/krzysztof/work/OpenROAD-flow-scripts/tools/OpenROAD/src/sta/search/Bfs.cc:120 (openroad+0xf5a873) (BuildId: ef1a58ea73076503ffc3dd20d054940867fe35e6)
    #4 sta::GraphDelayCalc::findVertexDelay(sta::Vertex*, sta::ArcDelayCalc*, bool) /home/krzysztof/work/OpenROAD-flow-scripts/tools/OpenROAD/src/sta/dcalc/GraphDelayCalc.cc:597 (openroad+0xe6113a) (BuildId: ef1a58ea73076503ffc3dd20d054940867fe35e6)
    #5 sta::FindVertexDelays::visit(sta::Vertex*) /home/krzysztof/work/OpenROAD-flow-scripts/tools/OpenROAD/src/sta/dcalc/GraphDelayCalc.cc:229 (openroad+0xe61205) (BuildId: ef1a58ea73076503ffc3dd20d054940867fe35e6)
    #6 operator() /home/krzysztof/work/OpenROAD-flow-scripts/tools/OpenROAD/src/sta/search/Bfs.cc:201 (openroad+0xf597a1) (BuildId: ef1a58ea73076503ffc3dd20d054940867fe35e6)
    #7 __invoke_impl<void, sta::BfsIterator::visitParallel(sta::Level, sta::VertexVisitor*)::<lambda(int)>&, int> /usr/include/c++/13.2.1/bits/invoke.h:61 (openroad+0xf597a1)
    #8 __invoke_r<void, sta::BfsIterator::visitParallel(sta::Level, sta::VertexVisitor*)::<lambda(int)>&, int> /usr/include/c++/13.2.1/bits/invoke.h:111 (openroad+0xf597a1)
    #9 _M_invoke /usr/include/c++/13.2.1/bits/std_function.h:290 (openroad+0xf597a1)
    #10 std::function<void (int)>::operator()(int) const /usr/include/c++/13.2.1/bits/std_function.h:591 (openroad+0x102741f) (BuildId: ef1a58ea73076503ffc3dd20d054940867fe35e6)
    #11 sta::DispatchQueue::dispatch_thread_handler(unsigned long) /home/krzysztof/work/OpenROAD-flow-scripts/tools/OpenROAD/src/sta/util/DispatchQueue.cc:100 (openroad+0x102741f)
    #12 void std::__invoke_impl<void, void (sta::DispatchQueue::*)(unsigned long), sta::DispatchQueue*, unsigned long>(std::__invoke_memfun_deref, void (sta::DispatchQueue::*&&)(unsigned long), sta::DispatchQueue*&&, unsigned long&&) /usr/include/c++/13.2.1/bits/invoke.h:74 (openroad+0x102844e) (BuildId: ef1a58ea73076503ffc3dd20d054940867fe35e6)
    #13 std::__invoke_result<void (sta::DispatchQueue::*)(unsigned long), sta::DispatchQueue*, unsigned long>::type std::__invoke<void (sta::DispatchQueue::*)(unsigned long), sta::DispatchQueue*, unsigned long>(void (sta::DispatchQueue::*&&)(unsigned long), sta::DispatchQueue*&&, unsigned long&&) /usr/include/c++/13.2.1/bits/invoke.h:96 (openroad+0x102844e)
    #14 void std::thread::_Invoker<std::tuple<void (sta::DispatchQueue::*)(unsigned long), sta::DispatchQueue*, unsigned long> >::_M_invoke<0ul, 1ul, 2ul>(std::_Index_tuple<0ul, 1ul, 2ul>) /usr/include/c++/13.2.1/bits/std_thread.h:292 (openroad+0x102844e)
    #15 std::thread::_Invoker<std::tuple<void (sta::DispatchQueue::*)(unsigned long), sta::DispatchQueue*, unsigned long> >::operator()() /usr/include/c++/13.2.1/bits/std_thread.h:299 (openroad+0x102844e)
    #16 std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (sta::DispatchQueue::*)(unsigned long), sta::DispatchQueue*, unsigned long> > >::_M_run() /usr/include/c++/13.2.1/bits/std_thread.h:244 (openroad+0x102844e)
    #17 execute_native_thread_routine /usr/src/debug/gcc/gcc/libstdc++-v3/src/c++11/thread.cc:104 (libstdc++.so.6+0xe1942) (BuildId: 26d15cc010d7e7c8831795d6b4f135e1d1958ee1)

kbieganski avatar Apr 04 '24 14:04 kbieganski

This one has been fixed in OpenSTA. We are waiting until we uprev the next version then we will close this.

tspyrou avatar Apr 17 '24 00:04 tspyrou

Issues or PRs should be filed with https://github.com/parallaxsw/OpenSTA if still relevant. This is effectively a fork (though not strictly for historical reasons).

maliberty avatar Aug 12 '24 12:08 maliberty