OpenROAD icon indicating copy to clipboard operation
OpenROAD copied to clipboard

dpo:edit commands of improvePlacement for better HPWL.

Open niv280 opened this issue 3 years ago • 29 comments

Signed-off-by: niv280 [email protected]

Hi, I change the set of command in improvePlacmentto to give better results. Here is comparision between the new set commands ( from the left ) to the old command ( in the right):

 [INFO ODB-0128] Design: aes_cipher_top	
Original HPWL           372201.2 u	Original HPWL           372201.2 u
Final HPWL              354025.3 u	Final HPWL              360418.2 u
Delta HPWL                  -4.9 %	Delta HPWL                  -3.2 %

[INFO ODB-0128] Design: gcd
Original HPWL             7709.2 u	Original HPWL             7709.2 u
Final HPWL                7154.5 u	Final HPWL                7322.4 u
Delta HPWL                  -7.2 %**	Delta HPWL                  -5.0 %

[INFO ODB-0128] Design: ibex_core
Original HPWL           652460.9 u	Original HPWL           652460.9 u
Final HPWL              636120.8 u	Final HPWL              639841.3 u
Delta HPWL                  -2.5 %	Delta HPWL                  -1.9 %

The most affect gain from increase the movePerCandidate from 20 originally to 100, even though the run time degraded I think it worth it, but that results is on small design so please advise if you think that is too much in bigger design.

Other change is change the order of the commands that gave better results. Also I notice that the mis and reorder method don't make much different at least on the tested design so I moved them to the end, but I also can comment them to reduce some run time.

niv280 avatar Oct 24 '22 15:10 niv280

Thanks, these results look nice. Please report the runtimes you observe. Would also run one larger design like swerv_wrapper to see the impact on something bigger.

Do you encounter any difficulties with routing the designs later in the flow?

maliberty avatar Oct 24 '22 15:10 maliberty

@akennings FYI for any comments

maliberty avatar Oct 24 '22 15:10 maliberty

Increasing the moves per candidate seems fine if the run time does not get out of control for larger designs. Changing the order of commands is also fine.

akennings avatar Oct 24 '22 16:10 akennings

If the def files varies run to run with the same parameters then that is a problem that needs investigating. We expect deterministic results.

maliberty avatar Oct 24 '22 17:10 maliberty

The problem was on my side, the results are deterministic.

The format code check are failing me, it's looks like it expecting change only in c++ files, this is true?

I still need to try to route the design with the new command, I will update about it tomorrow.

niv280 avatar Oct 24 '22 20:10 niv280

The format code check is not blocking for now. However it is helpful if you can run clang-format on the code to keep it up to date.

maliberty avatar Oct 24 '22 20:10 maliberty

Hi, I fixed the clang format error.

Also, I tried to route ibex design with the new command, here what it looked like: image

Here is with the old set of command: image

Unfortunately, the short and metal spacing violation was degraded with the new command from 4 to 21 but the overall wire length reduced. I will try to reduce the moves per candidate and see if I getting better results in the shorts violations.

I think it is also important to compare the congestion but I'm don't sure were I can check it, can you guide me?

niv280 avatar Oct 30 '22 06:10 niv280

Unfortunately, the short and metal spacing violation was degraded with the new command from 4 to 21 but the overall wire length reduced.

Interesting. I was wondering how we prevent DPO from undoing both timing and congestion based global placement. Should it incorporate the STA and grt data (like gpl currently does)?

antonblanchard avatar Oct 31 '22 02:10 antonblanchard

Unfortunately, the short and metal spacing violation was degraded with the new command from 4 to 21 but the overall wire length reduced.

Interesting. I was wondering how we prevent DPO from undoing both timing and congestion based global placement. Should it incorporate the STA and grt data (like gpl currently does)?

Right now we just hope it is ok. The changes tend to be fairly local and we hope that WL improvement is neutral to positive for timing. We could add some sort of try-reject strategy but that requires some sort of mechanism to be built for undo.

maliberty avatar Oct 31 '22 15:10 maliberty

To check congestion you can look at the congestion map in the GUI or dump it out with "gui::dump_heatmap name filename"

maliberty avatar Oct 31 '22 15:10 maliberty

The routing congestion looks pretty similar to me. The old commands: image

with new commands: image

What do you suggest for the next steps?

niv280 avatar Nov 03 '22 20:11 niv280

Perhaps it is something more local. The m3 shorts are new so perhaps take a look and see what is happening there. Is there increased local congestion that might explain the problem?

maliberty avatar Nov 03 '22 20:11 maliberty

@maliberty Maybe we should add a feature to compare the congestion values between two runs?

QuantamHD avatar Nov 03 '22 21:11 QuantamHD

@maliberty Maybe we should add a feature to compare the congestion values between two runs?

Sure though you can dump the heat map to a csv and do it easily externally.

maliberty avatar Nov 03 '22 21:11 maliberty

I think that I misunderstood the short in the design. I used that command for routing: detailed_route -output_drc dpo_tests/ibex_drc -output_maze dpo_tests/ibex.output.maze.log -output_guide_coverage dpo_tests/ibex.coverage.csv

and the output_drc file is empty, but in the last message in the tool I see the short that I upload above. Maybe the last drc report in the detailed_route is not the final one?

I tired also the check_drc command but it keep failed to me.

niv280 avatar Nov 04 '22 14:11 niv280

I checked it again the drc file is empty so as I understand it there is no drc in both of the command and the route succeed with the new command also there is small improvement in the power ( in aes design) .

Please advise for the next steps here.

niv280 avatar Nov 10 '22 20:11 niv280

@maliberty Do you have some advice?

QuantamHD avatar Nov 21 '22 19:11 QuantamHD

Would you include more of the router log. I wonder if you are looking at the right message or not.

maliberty avatar Nov 21 '22 19:11 maliberty

I attached log of 2 runs of ibex_sky130hd test, one with old command and one with new commands. To the test ( in flow.tcl) I added the improve_placement and check_placement proc to review the diff. Let me know if that not enough and you need me to add other command to the test.

Withe new commands: ibex_sky130hd.log

With old commands: ibex_sky130hd_old_commands.log

niv280 avatar Nov 26 '22 20:11 niv280

@niv280 I don't see actual detailed routing in either of the logs. It looks like pin access is running, but not detailed routing itself. To get the violation count, you should see something in the form of

[INFO DRT-0199]   Number of violations = ###.

rovinski avatar Nov 27 '22 05:11 rovinski

I think it was the verbose=0, here is the update logs:

With new commands: ibex_sky130hd.log

With old commands: ibex_sky130hd_old_commands.log

niv280 avatar Nov 27 '22 20:11 niv280

From the logs you posted, both versions converge to 0 violations, but the new one takes 11 iterations to do so compared to 10 previously.

rovinski avatar Nov 29 '22 07:11 rovinski

Yes, this is the downside of the new commands change but we are getting better HPWL

@maliberty what do you think?

niv280 avatar Nov 29 '22 18:11 niv280

So long as we can still get to 0 drvs that's ok. It is expected that denser designs will take longer to route.

maliberty avatar Nov 29 '22 19:11 maliberty

So, it required anything else to be approved? I'm just won't be available for the next 2 months from next week, so I want to close it.

niv280 avatar Nov 30 '22 07:11 niv280

@niv280 One thing for sure is that you need to adhere to the formatting rules. Please run clang-format on your code to pass the required formatting check.

rovinski avatar Nov 30 '22 15:11 rovinski

I pushed new commit with clang format fix and only the *.ok file changes.

niv280 avatar Dec 01 '22 17:12 niv280

I ran ORFS with this change and I see some degradation of results that need investigating (there are some more minor changes that we could just accept): nangate45/bp_multi - fails in grt due to congestion nangate45/jpeg - [ERROR] finish__timing__setup__ws fail test: -0.09 >= 0.0

maliberty avatar Dec 01 '22 17:12 maliberty

Maybe there should be less ambitious optimization e.g. 50 moves per candidate instead of 100? This is a good candidate for a sensitivity study / possibly throw at the autotuner.

rovinski avatar Dec 02 '22 15:12 rovinski