DFFRAM
Some DFFRAM configurations have hold violations
After updating to the latest OpenLane (which changes STA from using an ideal clock to an actual, propagated clock), DFFRAM is seeing hold violations. A simple test:
./dffram.py --size 32x32
shows:
[WARNING]: There are hold violations in the design at the typical corner. Please refer to /mnt/dffram/build/32x32_DEFAULT/openlane/runs/RUN_2022.03.02_22.00.30/reports/routing/13-parasitics_sta.min.rpt.
The two DFFRAMs I'm using in Microwatt both have hold violations as well:
./dffram.py --size 32x64 --variant 1RW1R --min-height 180
./dffram.py --size 512x64 --vertical-halo 100 --horizontal-halo 20
@shalan Can you update the handcrafted netlist?
@donn Please share the timing report(s) with me so I can investigate the cause of these hold violations.
@shalan I'll have to harvest them. @antonblanchard, do you have them on hand?
Here's an example from building a 512x64 DFFRAM. The clock makes it to the output buffering stage a long time before the memory elements.
======================= Typical Corner ===================================
Startpoint: Di0[1] (input port clocked by CLK)
Endpoint: BANK128[0].RAM128.BLOCK[3].RAM32.SLICE[2].RAM8.WORD[2].W.BYTE[0].B.BIT[1].genblk1.STORAGE
(positive level-sensitive latch clocked by CLK')
Path Group: CLK
Path Type: min
Corner: tt
Fanout Cap Slew Delay Time Description
-----------------------------------------------------------------------------
0.00 0.00 clock CLK (rise edge)
0.00 0.00 clock network delay (propagated)
2.24 2.24 v input external delay
0.47 0.31 2.56 v Di0[1] (in)
8 0.21 Di0[1] (net)
0.47 0.00 2.56 v BANK128[0].RAM128.DIBUF[1]/A (sky130_fd_sc_hd__clkbuf_16)
0.08 0.34 2.89 v BANK128[0].RAM128.DIBUF[1]/X (sky130_fd_sc_hd__clkbuf_16)
4 0.08 BANK128[0].RAM128.BLOCK[0].RAM32.Di0[1] (net)
0.08 0.02 2.91 v BANK128[0].RAM128.BLOCK[3].RAM32.DIBUF[1]/A (sky130_fd_sc_hd__clkbuf_16)
0.06 0.18 3.09 v BANK128[0].RAM128.BLOCK[3].RAM32.DIBUF[1]/X (sky130_fd_sc_hd__clkbuf_16)
32 0.07 BANK128[0].RAM128.BLOCK[3].RAM32.Di0_buf[1] (net)
0.06 0.00 3.09 v BANK128[0].RAM128.BLOCK[3].RAM32.SLICE[2].RAM8.WORD[2].W.BYTE[0].B.BIT[1].genblk1.STORAGE/D (sky130_fd_sc_hd__dlxtp_1)
3.09 data arrival time
0.00 0.00 clock CLK' (fall edge)
0.00 0.00 clock source latency
3.04 2.37 2.37 ^ CLK (in)
20 0.69 CLK (net)
3.40 0.00 2.37 ^ BANK128[0].RAM128.CLKBUF[3]/A (sky130_fd_sc_hd__clkbuf_4)
1.44 1.33 3.70 ^ BANK128[0].RAM128.CLKBUF[3]/X (sky130_fd_sc_hd__clkbuf_4)
10 0.51 BANK128[0].RAM128.BLOCK[3].RAM32.CLK (net)
1.47 0.19 3.90 ^ BANK128[0].RAM128.BLOCK[3].RAM32.CLKBUF/A (sky130_fd_sc_hd__clkbuf_2)
0.14 0.37 4.26 ^ BANK128[0].RAM128.BLOCK[3].RAM32.CLKBUF/X (sky130_fd_sc_hd__clkbuf_2)
4 0.02 BANK128[0].RAM128.BLOCK[3].RAM32.CLK_buf (net)
0.14 0.00 4.27 ^ BANK128[0].RAM128.BLOCK[3].RAM32.SLICE[2].RAM8.CLKBUF/A (sky130_fd_sc_hd__clkbuf_2)
0.13 0.20 4.47 ^ BANK128[0].RAM128.BLOCK[3].RAM32.SLICE[2].RAM8.CLKBUF/X (sky130_fd_sc_hd__clkbuf_2)
8 0.02 BANK128[0].RAM128.BLOCK[3].RAM32.SLICE[2].RAM8.CLK_buf (net)
0.13 0.00 4.47 ^ BANK128[0].RAM128.BLOCK[3].RAM32.SLICE[2].RAM8.WORD[2].W.CLKBUF/A (sky130_fd_sc_hd__clkbuf_4)
1.32 0.92 5.39 ^ BANK128[0].RAM128.BLOCK[3].RAM32.SLICE[2].RAM8.WORD[2].W.CLKBUF/X (sky130_fd_sc_hd__clkbuf_4)
16 0.45 BANK128[0].RAM128.BLOCK[3].RAM32.SLICE[2].RAM8.WORD[2].W.BYTE[0].B.CLK (net)
1.43 0.31 5.70 ^ BANK128[0].RAM128.BLOCK[3].RAM32.SLICE[2].RAM8.WORD[2].W.BYTE[0].B.genblk1.CLKINV/A (sky130_fd_sc_hd__inv_1)
0.20 0.15 5.85 v BANK128[0].RAM128.BLOCK[3].RAM32.SLICE[2].RAM8.WORD[2].W.BYTE[0].B.genblk1.CLKINV/Y (sky130_fd_sc_hd__inv_1)
1 0.00 BANK128[0].RAM128.BLOCK[3].RAM32.SLICE[2].RAM8.WORD[2].W.BYTE[0].B.CLK_B (net)
0.20 0.00 5.85 v BANK128[0].RAM128.BLOCK[3].RAM32.SLICE[2].RAM8.WORD[2].W.BYTE[0].B.genblk1.CG/CLK (sky130_fd_sc_hd__dlclkp_1)
0.14 0.30 6.15 v BANK128[0].RAM128.BLOCK[3].RAM32.SLICE[2].RAM8.WORD[2].W.BYTE[0].B.genblk1.CG/GCLK (sky130_fd_sc_hd__dlclkp_1)
8 0.03 BANK128[0].RAM128.BLOCK[3].RAM32.SLICE[2].RAM8.WORD[2].W.BYTE[0].B.GCLK (net)
0.14 0.00 6.15 v BANK128[0].RAM128.BLOCK[3].RAM32.SLICE[2].RAM8.WORD[2].W.BYTE[0].B.BIT[1].genblk1.STORAGE/GATE (sky130_fd_sc_hd__dlxtp_1)
0.25 6.40 clock uncertainty
0.00 6.40 clock reconvergence pessimism
0.00 6.40 library hold time
6.40 data required time
-----------------------------------------------------------------------------
6.40 data required time
-3.09 data arrival time
-----------------------------------------------------------------------------
-3.31 slack (VIOLATED)
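For anyone reading along who is less used to these reports: on a min (hold) path, the slack on the last line is the data arrival time minus the data required time. A minimal Python sketch using the numbers from the report above; the function name `hold_slack` and the variable names are my own and are not part of OpenSTA or DFFRAM:

```python
def hold_slack(arrival_ns, required_ns):
    """Hold (min-path) slack; a negative value means a violation."""
    return round(arrival_ns - required_ns, 2)

# Values copied from the typical-corner report above (ns).
data_arrival = 3.09       # Di0[1] reaches STORAGE/D
clock_at_gate = 6.15      # CLK' reaches STORAGE/GATE
clock_uncertainty = 0.25
crpr = 0.00               # clock reconvergence pessimism
library_hold = 0.00

data_required = clock_at_gate + clock_uncertainty + crpr + library_hold
slack = hold_slack(data_arrival, data_required)
print(f"required={data_required:.2f} ns, slack={slack:.2f} ns")
```

The data gets to the latch roughly 3 ns before the gated clock does, which is where the negative slack comes from.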
@shalan
The hold violation is due to a bad constraint on this input-to-register timing path. To fix it, we need to adjust the driving-cell constraint for the input ports so that it is realistic, e.g., clkbuf_4 instead of inv_1. This would reduce the clock latency and slew at the input.
Also, I noticed a few minor issues in the clock tree; fixing them would make it more robust.
I discussed the fixes with @donn and they should be out soon.
Should be fixed now
Okay, so everything but 8x* and the register file should be good now. @antonblanchard Mind testing?
Thank you @donn, the cache RAMs (32x64_1RW1R) have no hold violations. Unfortunately, my main RAM (512x64) still has hold violations:
./dffram.py --size 512x64 --vertical-halo 100 --horizontal-halo 20
Fanout Cap Slew Delay Time Description
-----------------------------------------------------------------------------
0.00 0.00 clock CLK (rise edge)
0.00 0.00 clock network delay (propagated)
3.75 3.75 v input external delay
0.14 0.10 3.85 v Di0[1] (in)
8 0.22 Di0[1] (net)
0.16 0.00 3.85 v BANK128[1].RAM128.DIBUF[1]/A (sky130_fd_sc_hd__clkbuf_16)
0.07 0.21 4.06 v BANK128[1].RAM128.DIBUF[1]/X (sky130_fd_sc_hd__clkbuf_16)
4 0.07 BANK128[1].RAM128.BLOCK[0].RAM32.Di0[1] (net)
0.07 0.00 4.06 v BANK128[1].RAM128.BLOCK[0].RAM32.DIBUF[1]/A (sky130_fd_sc_hd__clkbuf_16)
0.06 0.17 4.23 v BANK128[1].RAM128.BLOCK[0].RAM32.DIBUF[1]/X (sky130_fd_sc_hd__clkbuf_16)
32 0.07 BANK128[1].RAM128.BLOCK[0].RAM32.Di0_buf[1] (net)
0.06 0.00 4.23 v BANK128[1].RAM128.BLOCK[0].RAM32.SLICE[0].RAM8.WORD[7].W.BYTE[0].B.BIT[1].genblk1.STORAGE/D (sky130_fd_sc_hd__dlxtp_1)
4.23 data arrival time
0.00 0.00 clock CLK' (fall edge)
0.00 0.00 clock source latency
0.67 1.09 1.09 ^ CLK (in)
8 0.64 CLK (net)
1.87 0.00 1.09 ^ BANK128[1].RAM128.CLKBUF/A (sky130_fd_sc_hd__clkbuf_4)
0.25 0.60 1.69 ^ BANK128[1].RAM128.CLKBUF/X (sky130_fd_sc_hd__clkbuf_4)
8 0.08 BANK128[1].RAM128.BLOCK[0].RAM32.CLK (net)
0.25 0.00 1.69 ^ BANK128[1].RAM128.BLOCK[0].RAM32.CLKBUF/A (sky130_fd_sc_hd__clkbuf_4)
1.10 0.84 2.53 ^ BANK128[1].RAM128.BLOCK[0].RAM32.CLKBUF/X (sky130_fd_sc_hd__clkbuf_4)
5 0.37 BANK128[1].RAM128.BLOCK[0].RAM32.CLK_buf (net)
1.10 0.01 2.54 ^ BANK128[1].RAM128.BLOCK[0].RAM32.SLICE[0].RAM8.CLKBUF/A (sky130_fd_sc_hd__clkbuf_2)
0.14 0.35 2.88 ^ BANK128[1].RAM128.BLOCK[0].RAM32.SLICE[0].RAM8.CLKBUF/X (sky130_fd_sc_hd__clkbuf_2)
8 0.02 BANK128[1].RAM128.BLOCK[0].RAM32.SLICE[0].RAM8.CLK_buf (net)
0.14 0.00 2.88 ^ BANK128[1].RAM128.BLOCK[0].RAM32.SLICE[0].RAM8.WORD[7].W.CLKBUF/A (sky130_fd_sc_hd__clkbuf_4)
1.42 0.94 3.83 ^ BANK128[1].RAM128.BLOCK[0].RAM32.SLICE[0].RAM8.WORD[7].W.CLKBUF/X (sky130_fd_sc_hd__clkbuf_4)
16 0.49 BANK128[1].RAM128.BLOCK[0].RAM32.SLICE[0].RAM8.WORD[7].W.BYTE[0].B.CLK (net)
1.60 0.41 4.23 ^ BANK128[1].RAM128.BLOCK[0].RAM32.SLICE[0].RAM8.WORD[7].W.BYTE[0].B.genblk1.CLKINV/A (sky130_fd_sc_hd__inv_1)
0.21 0.16 4.39 v BANK128[1].RAM128.BLOCK[0].RAM32.SLICE[0].RAM8.WORD[7].W.BYTE[0].B.genblk1.CLKINV/Y (sky130_fd_sc_hd__inv_1)
1 0.01 BANK128[1].RAM128.BLOCK[0].RAM32.SLICE[0].RAM8.WORD[7].W.BYTE[0].B.CLK_B (net)
0.21 0.00 4.39 v BANK128[1].RAM128.BLOCK[0].RAM32.SLICE[0].RAM8.WORD[7].W.BYTE[0].B.genblk1.CG/CLK (sky130_fd_sc_hd__dlclkp_1)
0.16 0.32 4.71 v BANK128[1].RAM128.BLOCK[0].RAM32.SLICE[0].RAM8.WORD[7].W.BYTE[0].B.genblk1.CG/GCLK (sky130_fd_sc_hd__dlclkp_1)
8 0.03 BANK128[1].RAM128.BLOCK[0].RAM32.SLICE[0].RAM8.WORD[7].W.BYTE[0].B.GCLK (net)
0.16 0.00 4.71 v BANK128[1].RAM128.BLOCK[0].RAM32.SLICE[0].RAM8.WORD[7].W.BYTE[0].B.BIT[1].genblk1.STORAGE/GATE (sky130_fd_sc_hd__dlxtp_1)
0.25 4.96 clock uncertainty
0.00 4.96 clock reconvergence pessimism
0.01 4.97 library hold time
4.97 data required time
-----------------------------------------------------------------------------
4.97 data required time
-4.23 data arrival time
-----------------------------------------------------------------------------
-0.74 slack (VIOLATED)
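To make it easier to see where the capture clock latency accumulates in this second report, here is a rough Python sketch that ranks the entries of the Delay column. The stage labels are shortened by hand and are my own; the numbers are copied from the report, and because each entry is rounded to two decimals the total comes out at 4.72 rather than the reported 4.71. It is only a reading aid, not a diagnosis:

```python
# (stage, delay in ns) pairs copied from the Delay column of the
# capture clock path (CLK' fall edge) in the report above.
clock_stages = [
    ("CLK (in)", 1.09),
    ("RAM128.CLKBUF", 0.60),
    ("RAM32.CLKBUF", 0.84),
    ("RAM32.CLK_buf wire", 0.01),
    ("RAM8.CLKBUF", 0.35),
    ("WORD.CLKBUF", 0.94),
    ("BYTE.CLK wire", 0.41),
    ("CLKINV", 0.16),
    ("CG (dlclkp_1)", 0.32),
]

total = sum(delay for _, delay in clock_stages)
ranked = sorted(clock_stages, key=lambda stage: stage[1], reverse=True)

# Print the three largest contributors and their share of the clock path.
for name, delay in ranked[:3]:
    print(f"{name:20s} {delay:.2f} ns ({delay / total:.0%} of clock path)")
```

The two clkbuf_4 stages near the top of the ranking are the ones with the largest output slew in the report (1.10 and 1.42).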
Gosh. @shalan, can you weigh in?
Running STA across the entire design (including the 512x64 DFFRAM):
Startpoint: _131570_ (rising edge-triggered flip-flop clocked by user_clock2)
Endpoint: microwatt_0.soc0.bram.bram0.ram_0.memory_0/BANK128[1].RAM128.BLOCK[0].RAM32.SLICE[0].RAM8.WORD[7].W.BYTE[4].B.BIT[5].genblk1.STORAGE
(positive level-sensitive latch clocked by user_clock2')
Path Group: user_clock2
Path Type: min
Corner: tt
Fanout Cap Slew Delay Time Description
-----------------------------------------------------------------------------
0.00 0.00 clock user_clock2 (rise edge)
0.00 0.00 clock source latency
0.43 0.31 0.31 ^ user_clock2 (in)
1 0.09 user_clock2 (net)
0.44 0.00 0.31 ^ repeater12/A (sky130_fd_sc_hd__buf_12)
0.42 0.37 0.68 ^ repeater12/X (sky130_fd_sc_hd__buf_12)
1 0.38 net630 (net)
0.47 0.11 0.78 ^ clkbuf_0_user_clock2/A (sky130_fd_sc_hd__clkbuf_16)
1.15 0.69 1.47 ^ clkbuf_0_user_clock2/X (sky130_fd_sc_hd__clkbuf_16)
16 1.20 clknet_0_user_clock2 (net)
1.42 0.41 1.88 ^ clkbuf_4_12_0_user_clock2/A (sky130_fd_sc_hd__clkbuf_2)
0.44 0.53 2.42 ^ clkbuf_4_12_0_user_clock2/X (sky130_fd_sc_hd__clkbuf_2)
2 0.08 clknet_4_12_0_user_clock2 (net)
0.44 0.01 2.42 ^ clkbuf_5_25__f_user_clock2/A (sky130_fd_sc_hd__clkbuf_16)
0.26 0.36 2.79 ^ clkbuf_5_25__f_user_clock2/X (sky130_fd_sc_hd__clkbuf_16)
10 0.25 clknet_5_25__leaf_user_clock2 (net)
0.26 0.03 2.81 ^ clkbuf_leaf_164_user_clock2/A (sky130_fd_sc_hd__clkbuf_16)
0.10 0.23 3.05 ^ clkbuf_leaf_164_user_clock2/X (sky130_fd_sc_hd__clkbuf_16)
17 0.08 clknet_leaf_164_user_clock2 (net)
0.10 0.00 3.05 ^ _131570_/CLK (sky130_fd_sc_hd__dfxtp_1)
0.03 0.30 3.35 v _131570_/Q (sky130_fd_sc_hd__dfxtp_1)
8 0.00 microwatt_0.soc0.bram.bram0.ram_0._4_[37] (net)
0.03 0.00 3.35 v microwatt_0.soc0.bram.bram0.ram_0.memory_0/BANK128[1].RAM128.DIBUF[37]/A (sky130_fd_sc_hd__clkbuf_16)
0.07 0.16 3.52 v microwatt_0.soc0.bram.bram0.ram_0.memory_0/BANK128[1].RAM128.DIBUF[37]/X (sky130_fd_sc_hd__clkbuf_16)
4 0.08 microwatt_0.soc0.bram.bram0.ram_0.memory_0/BANK128[1].RAM128.BLOCK[0].RAM32.Di0[37] (net)
0.07 0.00 3.52 v microwatt_0.soc0.bram.bram0.ram_0.memory_0/BANK128[1].RAM128.BLOCK[0].RAM32.DIBUF[37]/A (sky130_fd_sc_hd__clkbuf_16)
0.08 0.18 3.70 v microwatt_0.soc0.bram.bram0.ram_0.memory_0/BANK128[1].RAM128.BLOCK[0].RAM32.DIBUF[37]/X (sky130_fd_sc_hd__clkbuf_16)
32 0.09 microwatt_0.soc0.bram.bram0.ram_0.memory_0/BANK128[1].RAM128.BLOCK[0].RAM32.Di0_buf[37] (net)
0.09 0.01 3.72 v microwatt_0.soc0.bram.bram0.ram_0.memory_0/BANK128[1].RAM128.BLOCK[0].RAM32.SLICE[0].RAM8.WORD[7].W.BYTE[4].B.BIT[5].genblk1.STORAGE/D (sky130_fd_sc_hd__dlxtp_1)
3.72 data arrival time
0.00 0.00 clock user_clock2' (fall edge)
0.00 0.00 clock source latency
0.43 0.34 0.34 ^ user_clock2 (in)
1 0.09 user_clock2 (net)
0.44 0.00 0.34 ^ repeater12/A (sky130_fd_sc_hd__buf_12)
0.42 0.41 0.75 ^ repeater12/X (sky130_fd_sc_hd__buf_12)
1 0.38 net630 (net)
0.47 0.12 0.86 ^ clkbuf_0_user_clock2/A (sky130_fd_sc_hd__clkbuf_16)
1.15 0.76 1.63 ^ clkbuf_0_user_clock2/X (sky130_fd_sc_hd__clkbuf_16)
16 1.20 clknet_0_user_clock2 (net)
1.40 0.43 2.06 ^ clkbuf_4_10_0_user_clock2/A (sky130_fd_sc_hd__clkbuf_2)
0.61 0.69 2.75 ^ clkbuf_4_10_0_user_clock2/X (sky130_fd_sc_hd__clkbuf_2)
2 0.11 clknet_4_10_0_user_clock2 (net)
0.61 0.03 2.78 ^ clkbuf_5_21__f_user_clock2/A (sky130_fd_sc_hd__clkbuf_16)
0.31 0.45 3.24 ^ clkbuf_5_21__f_user_clock2/X (sky130_fd_sc_hd__clkbuf_16)
9 0.30 clknet_5_21__leaf_user_clock2 (net)
0.34 0.07 3.31 ^ clkbuf_leaf_98_user_clock2/A (sky130_fd_sc_hd__clkbuf_16)
0.07 0.25 3.56 ^ clkbuf_leaf_98_user_clock2/X (sky130_fd_sc_hd__clkbuf_16)
13 0.05 clknet_leaf_98_user_clock2 (net)
0.07 0.00 3.56 ^ microwatt_0.soc0.bram.bram0.ram_0.memory_0/BANK128[1].RAM128.CLKBUF/A (sky130_fd_sc_hd__clkbuf_4)
0.23 0.29 3.84 ^ microwatt_0.soc0.bram.bram0.ram_0.memory_0/BANK128[1].RAM128.CLKBUF/X (sky130_fd_sc_hd__clkbuf_4)
8 0.08 microwatt_0.soc0.bram.bram0.ram_0.memory_0/BANK128[1].RAM128.BLOCK[0].RAM32.CLK (net)
0.23 0.00 3.85 ^ microwatt_0.soc0.bram.bram0.ram_0.memory_0/BANK128[1].RAM128.BLOCK[0].RAM32.CLKBUF/A (sky130_fd_sc_hd__clkbuf_4)
1.10 0.83 4.68 ^ microwatt_0.soc0.bram.bram0.ram_0.memory_0/BANK128[1].RAM128.BLOCK[0].RAM32.CLKBUF/X (sky130_fd_sc_hd__clkbuf_4)
5 0.37 microwatt_0.soc0.bram.bram0.ram_0.memory_0/BANK128[1].RAM128.BLOCK[0].RAM32.CLK_buf (net)
1.10 0.01 4.69 ^ microwatt_0.soc0.bram.bram0.ram_0.memory_0/BANK128[1].RAM128.BLOCK[0].RAM32.SLICE[0].RAM8.CLKBUF/A (sky130_fd_sc_hd__clkbuf_2)
0.14 0.35 5.03 ^ microwatt_0.soc0.bram.bram0.ram_0.memory_0/BANK128[1].RAM128.BLOCK[0].RAM32.SLICE[0].RAM8.CLKBUF/X (sky130_fd_sc_hd__clkbuf_2)
8 0.02 microwatt_0.soc0.bram.bram0.ram_0.memory_0/BANK128[1].RAM128.BLOCK[0].RAM32.SLICE[0].RAM8.CLK_buf (net)
0.14 0.00 5.03 ^ microwatt_0.soc0.bram.bram0.ram_0.memory_0/BANK128[1].RAM128.BLOCK[0].RAM32.SLICE[0].RAM8.WORD[7].W.CLKBUF/A (sky130_fd_sc_hd__clkbuf_4)
1.42 0.94 5.98 ^ microwatt_0.soc0.bram.bram0.ram_0.memory_0/BANK128[1].RAM128.BLOCK[0].RAM32.SLICE[0].RAM8.WORD[7].W.CLKBUF/X (sky130_fd_sc_hd__clkbuf_4)
16 0.49 microwatt_0.soc0.bram.bram0.ram_0.memory_0/BANK128[1].RAM128.BLOCK[0].RAM32.SLICE[0].RAM8.WORD[7].W.BYTE[0].B.CLK (net)
1.54 0.34 6.31 ^ microwatt_0.soc0.bram.bram0.ram_0.memory_0/BANK128[1].RAM128.BLOCK[0].RAM32.SLICE[0].RAM8.WORD[7].W.BYTE[4].B.genblk1.CLKINV/A (sky130_fd_sc_hd__inv_1)
0.21 0.16 6.47 v microwatt_0.soc0.bram.bram0.ram_0.memory_0/BANK128[1].RAM128.BLOCK[0].RAM32.SLICE[0].RAM8.WORD[7].W.BYTE[4].B.genblk1.CLKINV/Y (sky130_fd_sc_hd__inv_1)
1 0.01 microwatt_0.soc0.bram.bram0.ram_0.memory_0/BANK128[1].RAM128.BLOCK[0].RAM32.SLICE[0].RAM8.WORD[7].W.BYTE[4].B.CLK_B (net)
0.21 0.00 6.47 v microwatt_0.soc0.bram.bram0.ram_0.memory_0/BANK128[1].RAM128.BLOCK[0].RAM32.SLICE[0].RAM8.WORD[7].W.BYTE[4].B.genblk1.CG/CLK (sky130_fd_sc_hd__dlclkp_1)
0.17 0.33 6.80 v microwatt_0.soc0.bram.bram0.ram_0.memory_0/BANK128[1].RAM128.BLOCK[0].RAM32.SLICE[0].RAM8.WORD[7].W.BYTE[4].B.genblk1.CG/GCLK (sky130_fd_sc_hd__dlclkp_1)
8 0.03 microwatt_0.soc0.bram.bram0.ram_0.memory_0/BANK128[1].RAM128.BLOCK[0].RAM32.SLICE[0].RAM8.WORD[7].W.BYTE[4].B.GCLK (net)
0.17 0.00 6.80 v microwatt_0.soc0.bram.bram0.ram_0.memory_0/BANK128[1].RAM128.BLOCK[0].RAM32.SLICE[0].RAM8.WORD[7].W.BYTE[4].B.BIT[5].genblk1.STORAGE/GATE (sky130_fd_sc_hd__dlxtp_1)
0.25 7.05 clock uncertainty
-0.15 6.89 clock reconvergence pessimism
0.00 6.90 library hold time
6.90 data required time
-----------------------------------------------------------------------------
6.90 data required time
-3.72 data arrival time
-----------------------------------------------------------------------------
-3.18 slack (VIOLATED)
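The full-design report can also be read as a comparison between the data path delay and the clock skew between the launching flop and the capturing latch. A small Python sketch of that arithmetic, using only numbers from the report above (the variable names are mine):

```python
# Values copied from the full-design report above (ns).
launch_clk = 3.05     # user_clock2 at _131570_/CLK
capture_clk = 6.80    # user_clock2' at STORAGE/GATE
data_arrival = 3.72   # data at STORAGE/D
uncertainty = 0.25
crpr = -0.15          # clock reconvergence pessimism
library_hold = 0.00

skew = capture_clk - launch_clk          # extra clock latency inside the RAM
data_delay = data_arrival - launch_clk   # clk-to-Q plus the two DIBUF stages
required_delay = skew + uncertainty + crpr + library_hold
slack = round(data_delay - required_delay, 2)

print(f"skew={skew:.2f} data_delay={data_delay:.2f} slack={slack:.2f}")
```

Under these numbers the data path contributes about 0.67 ns against about 3.75 ns of skew, so the violation is dominated by the clock latency inside the DFFRAM rather than by the data path.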