OpenROAD icon indicating copy to clipboard operation
OpenROAD copied to clipboard

rsz: repair_timing says there are no violations while report_checks says the contrary

Open titan73 opened this issue 4 months ago • 8 comments

Describe the bug

The command: report_checks -path_delay min -sort_by_slack -format full_clock_expanded says:

Startpoint: STUSB_ISO_i/ftp_ctl_i/nvm_sect3_s_reg_0_
            (rising edge-triggered flip-flop clocked by clock_32k)
Endpoint: NVM_512bE2LTa_H9A_i
          (rising edge-triggered flip-flop clocked by clock_32k)
Path Group: clock_32k
Path Type: min
          -4.39   slack (VIOLATED)

After several call to repair_timing, I get:

[INFO RSZ-0100] Repair move sequence: UnbufferMove SizeUpMove SwapPinsMove BufferMove CloneMove SplitLoadMove 
[INFO RSZ-0098] No setup violations found
[INFO RSZ-0046] Found 2 endpoints with hold violations.
Iteration | Resized | Buffers | Cloned Gates |   Area   |   WNS   |   TNS   | Endpoint
--------------------------------------------------------------------------------------
        0 |       5 |       0 |            0 |    +0.0% |  -3.738 | -10.139 | STUSB_ISO_i/i2c_top_i/u_i2c_state/sda_in_d1_reg/D
    final |       5 |       2 |            0 |    +0.0% |  -3.738 |  -9.839 | STUSB_ISO_i/i2c_top_i/u_i2c_state/sda_in_d1_reg/D
--------------------------------------------------------------------------------------
[INFO RSZ-0032] Inserted 2 hold buffers.
[INFO RSZ-0100] Repair move sequence: UnbufferMove SizeUpMove SwapPinsMove BufferMove CloneMove SplitLoadMove 
[INFO RSZ-0098] No setup violations found
[INFO RSZ-0033] No hold violations found.

Then the command: report_checks -path_delay min -sort_by_slack -format full_clock_expanded says:

Startpoint: i_sda_in (clock source 'clock_sda_clk')
Endpoint: STUSB_ISO_i/i2c_top_i/u_i2c_state/sda_in_d1_reg
          (rising edge-triggered flip-flop clocked by clock_i2c_1m)
Path Group: clock_i2c_1m
Path Type: min
...
          -3.74   slack (VIOLATED)

Expected Behavior

Hold violations should be fixed.

Environment

OpenROAD v2.0-23570-g3fd4e08b52

To Reproduce

Test case repair_timing.zip uploaded to PII sftp server.

Relevant log output


Screenshots

No response

Additional Context

No response

titan73 avatar Aug 20 '25 13:08 titan73

I've tried to repair_timing after global routing based on orfs with:

global_route -verbose -allow_congestion
estimate_parasitics -global_routing
report_checks -path_delay max -sort_by_slack -format full_clock_expanded
report_checks -path_delay min -sort_by_slack -format full_clock_expanded
for {set i 0} {$i < 10} {incr i} {
    if {[repair_timing] == 0} {
        break
    }
}
global_route -verbose -allow_congestion -start_incremental
detailed_placement
global_route -verbose -allow_congestion -end_incremental
estimate_parasitics -global_routing
report_checks -path_delay max -sort_by_slack -format full_clock_expanded
report_checks -path_delay min -sort_by_slack -format full_clock_expanded

but I get a similar result: before:

setup:          30.04   slack (MET)
hold:          -5.34   slack (VIOLATED)

after:

setup:          29.84   slack (MET)
hold:          -5.34   slack (VIOLATED)

titan73 avatar Aug 20 '25 18:08 titan73

@povik I was able to reproduce this and something seems wrong.

maliberty avatar Aug 22 '25 04:08 maliberty

In @titan73's design the hold violation on STUSB_ISO_i/i2c_top_i/u_i2c_state/sda_in_d1_reg is ignored by repair_timing because the data pin is part of a clock tree and we explicitly ignore such pins in hold repair.

I've created a minimal example on sky130hd which I think is similar and has the same symptom. @titan73 To better understand your use case why is sda defined as a clock?

bug8075_reduced.v

module top(input wire scl, input wire sda,
		output wire sda_sampled);
	wire sda_buffered, scl_delayed;

	sky130_fd_sc_hd__buf_2 buf1(
			.A(sda),
			.X(sda_buffered));

	sky130_fd_sc_hd__clkdlybuf4s50_2 buf2(
			.A(scl),
			.X(scl_delayed));

	sky130_fd_sc_hd__dfxtp_1 reg1(
			.CLK(scl_delayed),
			.D(sda_buffered),
			.Q(sda_sampled));
endmodule

bug8075_reduced.sdc

create_clock -name clock_scl -period 10 [get_ports {scl}]
set_propagated_clock [get_clocks {clock_scl}]
create_clock -name clock_sda -period 10 [get_ports {sda}]
set_propagated_clock [get_clocks {clock_sda}]

set_clock_groups -name group -asynchronous \
    -group [list [get_clocks {clock_scl}] [get_clocks {clock_sda}]]

set_input_delay 5 -clock [get_clocks {clock_scl}] -max -add_delay [get_ports {sda}]

bug8075_reduced.tcl

read_lef sky130hd/sky130hd.tlef
read_lef sky130hd/sky130hd_std_cell.lef
read_liberty sky130hd/sky130hd_tt.lib

read_verilog bug8075_reduced.v
link_design top
set_wire_rc -layer met2

read_sdc bug8075_reduced.sdc

report_checks -path_delay min -format full_clock_expanded
repair_timing

produces the following output:

OpenROAD v2.0-23489-gdb3f0f8f05 
Features included (+) or not (-): +GPU +GUI +Python : Debug
This program is licensed under the BSD-3 license. See the LICENSE file for details.
Components of this program may be licensed under more restrictive licenses which must be honored.
[INFO ODB-0227] LEF file: sky130hd/sky130hd.tlef, created 13 layers, 25 vias
[INFO ODB-0227] LEF file: sky130hd/sky130hd_std_cell.lef, created 437 library cells
Startpoint: sda (clock source 'clock_sda')
Endpoint: reg1 (rising edge-triggered flip-flop clocked by clock_scl)
Path Group: clock_scl
Path Type: min

  Delay    Time   Description
---------------------------------------------------------
   0.00    0.00   clock clock_sda (rise edge)
   0.00    0.00   clock source latency
   0.00    0.00 ^ sda (in)
   0.07    0.07 ^ buf1/X (sky130_fd_sc_hd__buf_2)
   0.00    0.07 ^ reg1/D (sky130_fd_sc_hd__dfxtp_1)
           0.07   data arrival time

   0.00    0.00   clock clock_scl (rise edge)
   0.00    0.00   clock source latency
   0.00    0.00 ^ scl (in)
   0.46    0.46 ^ buf2/X (sky130_fd_sc_hd__clkdlybuf4s50_2)
   0.00    0.46 ^ reg1/CLK (sky130_fd_sc_hd__dfxtp_1)
   0.00    0.46   clock reconvergence pessimism
  -0.03    0.43   library hold time
           0.43   data required time
---------------------------------------------------------
           0.43   data required time
          -0.07   data arrival time
---------------------------------------------------------
          -0.36   slack (VIOLATED)


[WARNING EST-0027] no estimated parasitics. Using wire load models.
[INFO RSZ-0100] Repair move sequence: UnbufferMove SizeUpMove SwapPinsMove BufferMove CloneMove SplitLoadMove 
[INFO RSZ-0098] No setup violations found
[INFO RSZ-0033] No hold violations found.

povik avatar Sep 04 '25 15:09 povik

@povik Here is the answer from the designer: In our I2C slave design which doesn't use a high-speed clock to over-sample SCL. SDA is treated as a clock because it samples SCL to generate internal start and stop signals (which are also used as asynchronous resets). SDA and SCL have strict phase relationships defined by the I2C specification, making the paths synchronous with well-defined timing constraints.

titan73 avatar Sep 05 '25 11:09 titan73

Thanks. So in this design SDA and SCL both sample each other.

The isClock condition has been there since the inception of repair_hold (https://github.com/The-OpenROAD-Project/OpenROAD/commit/052034b4d470aee69339dd854d86aad3d22e719e). We might need special handling of hold fixing when the data comes from a clock network since I'm not sure the existing buffer insertion criteria will prevent new timing violations when editing a clock network (https://github.com/The-OpenROAD-Project/OpenROAD/blob/72c585d60d4a8bbf470858ce059133003bcdfd31/src/rsz/src/RepairHold.cc#L601-L612).

An easy handling would be to insert the delay buffer directly in front of the register whenever we're dealing with a clock network.

povik avatar Sep 05 '25 12:09 povik

Is this due to any updates to the clock propagation with clock gating?

My 2 cents:

  1. If isClock is set, the gate should be "optimized" by CTS. But I don't think this is right in this case since it goes to a sequential input pin, not a clock pin.
  2. If isClock is not set, the optimizer should handle it.

Personally, I think #2 is the better way to go since CTS will be focusing on skew and this is a bit different.

The question is whether we do special-case unflagging of the isClock or make this a user task. In any case, it should be done in the clock propagation logic, in my opinion. I'm not sure how we would do it automatically to satisfy everyone's needs.

mguthaus avatar Sep 16 '25 12:09 mguthaus

Is this due to any updates to the clock propagation with clock gating?

I don't think so. The recent changes were to cts while isClock is an sta API.

povik avatar Sep 16 '25 13:09 povik

@povik I like this solution : An easy handling would be to insert the delay buffer directly in front of the register whenever we're dealing with a clock network.

tspyrou avatar Oct 30 '25 15:10 tspyrou