rsz: repair_timing says there are no violations while report_checks says the contrary
Describe the bug
The command:
report_checks -path_delay min -sort_by_slack -format full_clock_expanded
says:
Startpoint: STUSB_ISO_i/ftp_ctl_i/nvm_sect3_s_reg_0_
(rising edge-triggered flip-flop clocked by clock_32k)
Endpoint: NVM_512bE2LTa_H9A_i
(rising edge-triggered flip-flop clocked by clock_32k)
Path Group: clock_32k
Path Type: min
-4.39 slack (VIOLATED)
After several call to repair_timing, I get:
[INFO RSZ-0100] Repair move sequence: UnbufferMove SizeUpMove SwapPinsMove BufferMove CloneMove SplitLoadMove
[INFO RSZ-0098] No setup violations found
[INFO RSZ-0046] Found 2 endpoints with hold violations.
Iteration | Resized | Buffers | Cloned Gates | Area | WNS | TNS | Endpoint
--------------------------------------------------------------------------------------
0 | 5 | 0 | 0 | +0.0% | -3.738 | -10.139 | STUSB_ISO_i/i2c_top_i/u_i2c_state/sda_in_d1_reg/D
final | 5 | 2 | 0 | +0.0% | -3.738 | -9.839 | STUSB_ISO_i/i2c_top_i/u_i2c_state/sda_in_d1_reg/D
--------------------------------------------------------------------------------------
[INFO RSZ-0032] Inserted 2 hold buffers.
[INFO RSZ-0100] Repair move sequence: UnbufferMove SizeUpMove SwapPinsMove BufferMove CloneMove SplitLoadMove
[INFO RSZ-0098] No setup violations found
[INFO RSZ-0033] No hold violations found.
Then the command:
report_checks -path_delay min -sort_by_slack -format full_clock_expanded
says:
Startpoint: i_sda_in (clock source 'clock_sda_clk')
Endpoint: STUSB_ISO_i/i2c_top_i/u_i2c_state/sda_in_d1_reg
(rising edge-triggered flip-flop clocked by clock_i2c_1m)
Path Group: clock_i2c_1m
Path Type: min
...
-3.74 slack (VIOLATED)
Expected Behavior
Hold violations should be fixed.
Environment
OpenROAD v2.0-23570-g3fd4e08b52
To Reproduce
Test case repair_timing.zip uploaded to PII sftp server.
Relevant log output
Screenshots
No response
Additional Context
No response
I've tried to repair_timing after global routing based on orfs with:
global_route -verbose -allow_congestion
estimate_parasitics -global_routing
report_checks -path_delay max -sort_by_slack -format full_clock_expanded
report_checks -path_delay min -sort_by_slack -format full_clock_expanded
for {set i 0} {$i < 10} {incr i} {
if {[repair_timing] == 0} {
break
}
}
global_route -verbose -allow_congestion -start_incremental
detailed_placement
global_route -verbose -allow_congestion -end_incremental
estimate_parasitics -global_routing
report_checks -path_delay max -sort_by_slack -format full_clock_expanded
report_checks -path_delay min -sort_by_slack -format full_clock_expanded
but I get a similar result: before:
setup: 30.04 slack (MET)
hold: -5.34 slack (VIOLATED)
after:
setup: 29.84 slack (MET)
hold: -5.34 slack (VIOLATED)
@povik I was able to reproduce this and something seems wrong.
In @titan73's design the hold violation on STUSB_ISO_i/i2c_top_i/u_i2c_state/sda_in_d1_reg is ignored by repair_timing because the data pin is part of a clock tree and we explicitly ignore such pins in hold repair.
I've created a minimal example on sky130hd which I think is similar and has the same symptom. @titan73 To better understand your use case why is sda defined as a clock?
bug8075_reduced.v
module top(input wire scl, input wire sda,
output wire sda_sampled);
wire sda_buffered, scl_delayed;
sky130_fd_sc_hd__buf_2 buf1(
.A(sda),
.X(sda_buffered));
sky130_fd_sc_hd__clkdlybuf4s50_2 buf2(
.A(scl),
.X(scl_delayed));
sky130_fd_sc_hd__dfxtp_1 reg1(
.CLK(scl_delayed),
.D(sda_buffered),
.Q(sda_sampled));
endmodule
bug8075_reduced.sdc
create_clock -name clock_scl -period 10 [get_ports {scl}]
set_propagated_clock [get_clocks {clock_scl}]
create_clock -name clock_sda -period 10 [get_ports {sda}]
set_propagated_clock [get_clocks {clock_sda}]
set_clock_groups -name group -asynchronous \
-group [list [get_clocks {clock_scl}] [get_clocks {clock_sda}]]
set_input_delay 5 -clock [get_clocks {clock_scl}] -max -add_delay [get_ports {sda}]
bug8075_reduced.tcl
read_lef sky130hd/sky130hd.tlef
read_lef sky130hd/sky130hd_std_cell.lef
read_liberty sky130hd/sky130hd_tt.lib
read_verilog bug8075_reduced.v
link_design top
set_wire_rc -layer met2
read_sdc bug8075_reduced.sdc
report_checks -path_delay min -format full_clock_expanded
repair_timing
produces the following output:
OpenROAD v2.0-23489-gdb3f0f8f05
Features included (+) or not (-): +GPU +GUI +Python : Debug
This program is licensed under the BSD-3 license. See the LICENSE file for details.
Components of this program may be licensed under more restrictive licenses which must be honored.
[INFO ODB-0227] LEF file: sky130hd/sky130hd.tlef, created 13 layers, 25 vias
[INFO ODB-0227] LEF file: sky130hd/sky130hd_std_cell.lef, created 437 library cells
Startpoint: sda (clock source 'clock_sda')
Endpoint: reg1 (rising edge-triggered flip-flop clocked by clock_scl)
Path Group: clock_scl
Path Type: min
Delay Time Description
---------------------------------------------------------
0.00 0.00 clock clock_sda (rise edge)
0.00 0.00 clock source latency
0.00 0.00 ^ sda (in)
0.07 0.07 ^ buf1/X (sky130_fd_sc_hd__buf_2)
0.00 0.07 ^ reg1/D (sky130_fd_sc_hd__dfxtp_1)
0.07 data arrival time
0.00 0.00 clock clock_scl (rise edge)
0.00 0.00 clock source latency
0.00 0.00 ^ scl (in)
0.46 0.46 ^ buf2/X (sky130_fd_sc_hd__clkdlybuf4s50_2)
0.00 0.46 ^ reg1/CLK (sky130_fd_sc_hd__dfxtp_1)
0.00 0.46 clock reconvergence pessimism
-0.03 0.43 library hold time
0.43 data required time
---------------------------------------------------------
0.43 data required time
-0.07 data arrival time
---------------------------------------------------------
-0.36 slack (VIOLATED)
[WARNING EST-0027] no estimated parasitics. Using wire load models.
[INFO RSZ-0100] Repair move sequence: UnbufferMove SizeUpMove SwapPinsMove BufferMove CloneMove SplitLoadMove
[INFO RSZ-0098] No setup violations found
[INFO RSZ-0033] No hold violations found.
@povik Here is the answer from the designer: In our I2C slave design which doesn't use a high-speed clock to over-sample SCL. SDA is treated as a clock because it samples SCL to generate internal start and stop signals (which are also used as asynchronous resets). SDA and SCL have strict phase relationships defined by the I2C specification, making the paths synchronous with well-defined timing constraints.
Thanks. So in this design SDA and SCL both sample each other.
The isClock condition has been there since the inception of repair_hold (https://github.com/The-OpenROAD-Project/OpenROAD/commit/052034b4d470aee69339dd854d86aad3d22e719e). We might need special handling of hold fixing when the data comes from a clock network since I'm not sure the existing buffer insertion criteria will prevent new timing violations when editing a clock network (https://github.com/The-OpenROAD-Project/OpenROAD/blob/72c585d60d4a8bbf470858ce059133003bcdfd31/src/rsz/src/RepairHold.cc#L601-L612).
An easy handling would be to insert the delay buffer directly in front of the register whenever we're dealing with a clock network.
Is this due to any updates to the clock propagation with clock gating?
My 2 cents:
- If isClock is set, the gate should be "optimized" by CTS. But I don't think this is right in this case since it goes to a sequential input pin, not a clock pin.
- If isClock is not set, the optimizer should handle it.
Personally, I think #2 is the better way to go since CTS will be focusing on skew and this is a bit different.
The question is whether we do special-case unflagging of the isClock or make this a user task. In any case, it should be done in the clock propagation logic, in my opinion. I'm not sure how we would do it automatically to satisfy everyone's needs.
Is this due to any updates to the clock propagation with clock gating?
I don't think so. The recent changes were to cts while isClock is an sta API.
@povik I like this solution : An easy handling would be to insert the delay buffer directly in front of the register whenever we're dealing with a clock network.