Thread-safe `Network`
OpenSTA currently is not thread-safe.
If you run the OpenROAD flow in multi-threaded mode (pass OR_ARGS='-threads $(nproc)' to make) you can easily get a data race, often resulting in an infinite loop or segfault, e.g.:
Signal 11 received
Stack trace:
0# 0x00005574A338B47F in /home/krzysztof/OpenROAD-flow-scripts/tools/install/OpenROAD/bin/openroad
1# 0x00007F80C145C710 in /usr/lib/libc.so.6
2# std::_Rb_tree_insert_and_rebalance(bool, std::_Rb_tree_node_base*, std::_Rb_tree_node_base*, std::_Rb_tree_node_base&) in /usr/lib/libstdc++.so.6
3# sta::Network::drivers(sta::Net const*) in /home/krzysztof/OpenROAD-flow-scripts/tools/install/OpenROAD/bin/openroad
4# sta::GraphDelayCalc::findMultiDrvrNet(sta::Vertex*) in /home/krzysztof/OpenROAD-flow-scripts/tools/install/OpenROAD/bin/openroad
5# sta::GraphDelayCalc::findDriverDelays(sta::Vertex*, sta::ArcDelayCalc*) in /home/krzysztof/OpenROAD-flow-scripts/tools/install/OpenROAD/bin/openroad
6# sta::GraphDelayCalc::findVertexDelay(sta::Vertex*, sta::ArcDelayCalc*, bool) in /home/krzysztof/OpenROAD-flow-scripts/tools/install/OpenROAD/bin/openroad
7# 0x00005574A36F3B46 in /home/krzysztof/OpenROAD-flow-scripts/tools/install/OpenROAD/bin/openroad
8# sta::DispatchQueue::dispatch_thread_handler(unsigned long) in /home/krzysztof/OpenROAD-flow-scripts/tools/install/OpenROAD/bin/openroad
9# 0x00007F80C16E1943 in /usr/lib/libstdc++.so.6
10# 0x00007F80C14AA9EB in /usr/lib/libc.so.6
11# 0x00007F80C152E7CC in /usr/lib/libc.so.6
This is from the 5_1_grt target, not sure if it happens in other stages.
This stack trace starts in the dispatcher, but the real story is this:
void
GraphDelayCalc::findDelays(Level level)
{
...
FindVertexDelays visitor(this);
dcalc_count += iter_->visitParallel(level, &visitor);
...
As the name implies, BfsIterator::visitParallel is multi-threaded. It calls (in this case) FindVertexDelays::visit(...) from multiple threads.
void
FindVertexDelays::visit(Vertex *vertex)
{
graph_delay_calc1_->findVertexDelay(vertex, arc_delay_calc_, true);
}
void
GraphDelayCalc::findVertexDelay(Vertex *vertex,
ArcDelayCalc *arc_delay_calc,
bool propagate)
{
...
bool delay_changed = findDriverDelays(vertex, arc_delay_calc);
...
bool
GraphDelayCalc::findDriverDelays(Vertex *drvr_vertex,
ArcDelayCalc *arc_delay_calc)
{
bool delay_changed = false;
MultiDrvrNet *multi_drvr = findMultiDrvrNet(drvr_vertex);
...
MultiDrvrNet *
GraphDelayCalc::findMultiDrvrNet(Vertex *drvr_vertex)
...
const PinSet *drvrs = network_->drivers(drvr_vertex->pin());
...
PinSet *
Network::drivers(const Pin *pin)
{
...
return drivers(net);
...
Network::drivers(const Net*)inserts into what's basically anstd::map.
PinSet *
Network::drivers(const Net *net)
{
...
net_drvr_pin_map_[net] = drvrs;
...
This insertion is not thread-safe. This PR gets rid of this map in Network, as it seems to only be used there for caching. In ConcreteNetwork, it adds synchronization to code that accesses this map in a multi-threaded context.
Below is a performance regression check on 5_1_grt.
Ibex
| Command | Mean [s] | Min [s] | Max [s] | Relative |
|---|---|---|---|---|
ibex / 5_1_grt / master |
125.214 ± 0.776 | 123.797 | 126.203 | 1.02 ± 0.01 |
ibex / 5_1_grt / thread-safe-network |
122.379 ± 0.670 | 121.476 | 123.630 | 1.00 |
| Command | Mean [s] | Min [s] | Max [s] | Relative |
|---|---|---|---|---|
ibex / 5_1_grt / master |
124.140 ± 3.096 | 118.442 | 127.865 | 1.01 ± 0.03 |
ibex / 5_1_grt / thread-safe-network |
122.610 ± 0.516 | 121.760 | 123.394 | 1.00 |
BlackParrot
| Command | Mean [s] | Min [s] | Max [s] | Relative |
|---|---|---|---|---|
black_parrot / 5_1_grt / master |
834.350 ± 7.627 | 817.349 | 842.018 | 1.01 ± 0.01 |
black_parrot / 5_1_grt / thread-safe-network |
828.409 ± 6.578 | 816.466 | 837.040 | 1.00 |
This is no longer needed. Cherry pushed a fix which is directed at the root cause. This fix will show up when we next pull from the upstream.
Fixed upstream