Detail route has long pause after 100% in nth iterations

Open oharboe opened this issue 1 year ago • 13 comments

Originally posted by @oharboe in https://github.com/The-OpenROAD-Project/OpenROAD/discussions/4598

Standalone reproduction case based on megaboom test branch

  1. Untar https://drive.google.com/file/d/1fAHMqFNlZlq6iZY8Rl9GLjWJjz-WsiX6/view?usp=sharing
  2. Run NUM_CORES=16 ./run-me-TilePRCIDomain-asap7-base.sh
  3. Observe long pause in the 0th iteration after 100%:
[ca. 4 hours or so]
    Completing 100% with 79991 violations.
    elapsed time = 04:47:55, memory = 59083.30 (MB).
[several hours - what is happening here?]
[INFO DRT-0199]   Number of violations = 87967.
[continued output]

Running times for this TilePRCIDomain, just for reference:

Log                       Elapsed seconds
2_1_floorplan                    393
2_2_floorplan_io                  12
No elapsed time found in bazel-bin/logs/asap7/TilePRCIDomain/base/2_3_floorplan_tdms.log
2_4_floorplan_macro             1218
2_5_floorplan_tapcell            582
2_6_floorplan_pdn                451
3_1_place_gp_skip_io             677
3_2_place_iop                     19
3_3_place_gp                    6877
3_4_place_resized               1363
3_5_place_dp                    1511
4_1_cts                         5101
5_1_grt                        67447
5_2_fillcell                      70
5_3_route                     226679
Total                         312400

oharboe avatar Feb 01 '24 14:02 oharboe

I know the connectivity checker runs at that point but I'm not sure if that's the issue

maliberty avatar Feb 01 '24 16:02 maliberty

> I know the connectivity checker runs at that point but I'm not sure if that's the issue

Would it make sense to add some idempotent progress output that is displayed if this takes much longer than normal?

That would allow deltaDebug to be used....

oharboe avatar Feb 01 '24 16:02 oharboe

How would you define 'much longer than normal' in an idempotent way?

maliberty avatar Feb 01 '24 16:02 maliberty

> How would you define 'much longer than normal' in an idempotent way?

Good question. I don't even know what the code is doing at this point. Answering that requires understanding what the code does and how to quantify progress in an idempotent way that is independent of the number of cores running...

oharboe avatar Feb 01 '24 17:02 oharboe

Are you asking for a progress message or a "more than normal" message?

maliberty avatar Feb 01 '24 17:02 maliberty

> Are you asking for a progress message or a "more than normal" message?

For deltaDebug, a "more than normal" message would do, as long as it reduces the test case to something practical... Still a tricky requirement.

oharboe avatar Feb 01 '24 17:02 oharboe

I don't see any way to measure time in an idempotent way - it will vary too much. I also don't know what normal is. We could add a progress message.

maliberty avatar Feb 01 '24 19:02 maliberty

> I don't see any way to measure time in an idempotent way - it will vary too much. I also don't know what normal is. We could add a progress message.

Maybe initially add some debug logging?

How else to debug something that takes ca. 4 hours to reach?

oharboe avatar Feb 01 '24 19:02 oharboe

Sure, though that won't be idempotent.

maliberty avatar Feb 01 '24 19:02 maliberty

> Sure, though that won't be idempotent.

For debug logging it is less critical that it is not noisy, so would it be easier to make it idempotent?

oharboe avatar Feb 01 '24 20:02 oharboe

Debug prints won't tell you anything about "more than normal" and are not so different from a progress message. It's not a matter of noise but of variable time that prevents idempotency.

maliberty avatar Feb 01 '24 20:02 maliberty

+1 to preserving hermiticity/idempotency as a top priority. We should just have an option to turn on progress messages/debug output.

QuantamHD avatar Feb 01 '24 20:02 QuantamHD

> +1 to preserving hermiticity/idempotency as a top priority. We should just have an option to turn on progress messages/debug output.

Indeed.

Other than that, the concern is that something has to drive deltaDebug robustly.

oharboe avatar Feb 01 '24 20:02 oharboe
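
The thread leaves open how an opt-in progress message could stay idempotent. One possible approach, shown as a minimal Python sketch and not as OpenROAD code: count completed work items instead of measuring time, and print only when the option is enabled. The class name `CountedProgress`, the message text, and the idea that the post-100% phase can be split into countable "items" are all assumptions made for illustration.

```python
import io
import threading
from concurrent.futures import ThreadPoolExecutor


class CountedProgress:
    """Hypothetical opt-in progress reporter keyed on work counts, not time."""

    def __init__(self, total, step, enabled=False, out=None):
        self.total = total          # number of work items expected
        self.step = step            # report every `step` completed items
        self.enabled = enabled      # off by default so normal logs are unchanged
        self.out = out
        self.done = 0
        self._lock = threading.Lock()

    def tick(self):
        # Count and write under one lock so messages appear in count order
        # no matter which worker thread finishes an item first.
        with self._lock:
            self.done += 1
            if self.enabled and self.done % self.step == 0:
                self.out.write(f"progress: {self.done}/{self.total} items\n")


def run(workers, items=10, step=4):
    """Process `items` dummy work items on `workers` threads; return the log."""
    buf = io.StringIO()
    progress = CountedProgress(items, step, enabled=True, out=buf)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(lambda _: progress.tick(), range(items)))
    return buf.getvalue()


if __name__ == "__main__":
    # The log text is the same for 1 and 8 workers; only wall time differs.
    print(run(1) == run(8))
```

This only makes the progress lines themselves reproducible across core counts. It does not define what "more than normal" means, so something else would still have to decide when a run counts as failing for deltaDebug.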