gpl: slowdown when activating multi-threading

Open gudeh opened this issue 1 year ago • 4 comments

Description

This issue was discovered while addressing non-determinism with GPL when multi-threading was activated (https://github.com/The-OpenROAD-Project/OpenROAD/issues/5360).

I tested the GPL runtime using secure-CI (multiple designs) and locally via a GCP server with Nangate45/Swerv. The tests indicate that activating multi-threading results in the same runtime or, in some cases, even slows down GPL.

I believe the slowdown is due to too many context switches with minimal gain in parallelism, as the loops that use multi-threading typically perform simple mathematical operations.

  • The OMP parallel loops in GPL were reportedly added by Antmicro developers, who said they improved runtime. TODO: Trace the added code using GitHub blame and check whether any runtime reports are available.
  • Identify which loops are causing the slowdown by measuring the runtime of each.
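A minimal sketch of how a single loop could be timed at different thread counts. This is not GPL code: `sumOfNorms` is a hypothetical stand-in for a loop that does cheap arithmetic per element, and `secondsFor` is a hypothetical helper. It assumes a build with `-fopenmp`; without that flag the pragma is ignored and the loop runs serially.

```cpp
#include <algorithm>
#include <chrono>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a GPL-style loop: one sqrt and a few
// multiply-adds per element. `threads` must be positive; it is passed to
// the OpenMP num_threads clause so each call can use a different team size.
double sumOfNorms(const std::vector<double>& x,
                  const std::vector<double>& y,
                  int threads)
{
  const long n = static_cast<long>(std::min(x.size(), y.size()));
  double sum = 0.0;
#pragma omp parallel for reduction(+ : sum) num_threads(threads)
  for (long i = 0; i < n; ++i) {
    sum += std::sqrt(x[i] * x[i] + y[i] * y[i]);
  }
  return sum;
}

// Wall-clock seconds taken by one call to fn. steady_clock is monotonic,
// so the result is not affected by system clock adjustments.
template <typename Fn>
double secondsFor(Fn&& fn)
{
  const auto start = std::chrono::steady_clock::now();
  fn();
  const auto stop = std::chrono::steady_clock::now();
  return std::chrono::duration<double>(stop - start).count();
}
```

Calling `secondsFor([&] { sumOfNorms(x, y, 1); })` and then the same with `8` gives a per-loop comparison. Repeating each measurement several times and taking the median reduces noise.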

Suggested Solution

No response

Additional Context

No response

gudeh, Aug 20 '24 17:08

I suggest you look at the PR as they often include data there. There may have been multiple changes.

maliberty, Aug 20 '24 20:08

This is the PR: https://github.com/The-OpenROAD-Project/OpenROAD/pull/4580.

Hi @kbieganski, do you still see improvements in runtime for gpl with MT?

gudeh, Aug 21 '24 01:08

It could be the case that the MT only helps on larger designs, and is a net negative on designs below a certain size threshold.
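If that hypothesis holds, one possible mitigation is the OpenMP `if` clause, which parallelizes a loop only when the input is large enough. The sketch below is illustrative, not taken from GPL: `scaleInPlace` and the cutoff `kMinParallelSize` are hypothetical, and a real threshold would have to come from measurement. It assumes a build with `-fopenmp`; otherwise the pragma is ignored.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical cutoff, chosen only for illustration.
constexpr std::size_t kMinParallelSize = 100000;

// Multiplies every element by `factor` in place. When the vector is shorter
// than `minParallelSize`, the `if` clause evaluates to false and the loop is
// executed by a single thread instead of a full team.
void scaleInPlace(std::vector<float>& v,
                  float factor,
                  std::size_t minParallelSize = kMinParallelSize)
{
  const long n = static_cast<long>(v.size());
#pragma omp parallel for if (v.size() >= minParallelSize)
  for (long i = 0; i < n; ++i) {
    v[i] *= factor;
  }
}
```

The result is the same on both paths; only the scheduling differs, so small designs avoid most of the threading overhead while large ones keep the parallel loop.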

QuantamHD, Aug 21 '24 05:08

I haven't observed a slowdown caused by multi-threading on 3_3_place_gp in ORFS:

ibex

| Threads | Min [s] | Max [s] | Median [s] | Relative median |
|---------|---------|---------|------------|-----------------|
| 1       | 49.90   | 51.29   | 51.05      | 1.04            |
| 8       | 48.06   | 49.41   | 48.88      | 1.00            |

ariane133

| Threads | Min [s] | Max [s] | Median [s] | Relative median |
|---------|---------|---------|------------|-----------------|
| 1       | 930.71  | 940.08  | 935.91     | 1.29            |
| 8       | 719.30  | 727.28  | 725.57     | 1.00            |

black_parrot

| Threads | Min [s] | Max [s] | Median [s] | Relative median |
|---------|---------|---------|------------|-----------------|
| 1       | 1417.57 | 1432.21 | 1424.02    | 1.25            |
| 8       | 1134.33 | 1146.48 | 1139.92    | 1.00            |

This is on a version of OpenROAD from June.

> I believe the slowdown is due to too many context switches

If you're observing too many context switches, perhaps you're running more threads than physical cores? Set your -threads (NUM_CORES in ORFS) to the number of physical cores you have. By default it uses nproc which, unfortunately, reports the logical cores.

kbieganski, Oct 08 '24 14:10