microwatt icon indicating copy to clipboard operation
microwatt copied to clipboard

liteeth: Regenerate

Open antonblanchard opened this issue 3 years ago • 6 comments

Regenerate from upstream litex.

Signed-off-by: Anton Blanchard [email protected]

antonblanchard avatar Aug 04 '22 12:08 antonblanchard

Wow, this seems a lot bigger - 27064 LUTs on Arty A7-100 for my usual build including litesdcard vs. 25148 before, so nearly 2000 more LUTs, but 4 fewer block RAM tiles. It looks like we are using LUTs for memory now where we were previously using block RAMs:

After:

+----------------------------+-------+-------+-----------+-------+
|          Site Type         |  Used | Fixed | Available | Util% |
+----------------------------+-------+-------+-----------+-------+
| Slice LUTs                 | 27064 |     2 |     63400 | 42.69 |
|   LUT as Logic             | 23772 |     2 |     63400 | 37.50 |
|   LUT as Memory            |  3292 |     0 |     19000 | 17.33 |
|     LUT as Distributed RAM |  3290 |     0 |           |       |
|     LUT as Shift Register  |     2 |     0 |           |       |

Before:

+----------------------------+-------+-------+-----------+-------+
|          Site Type         |  Used | Fixed | Available | Util% |
+----------------------------+-------+-------+-----------+-------+
| Slice LUTs                 | 25148 |     2 |     63400 | 39.67 |
|   LUT as Logic             | 23586 |     2 |     63400 | 37.20 |
|   LUT as Memory            |  1562 |     0 |     19000 |  8.22 |
|     LUT as Distributed RAM |  1560 |     0 |           |       |
|     LUT as Shift Register  |     2 |     0 |           |       |

So 1732 LUTs being used as distributed RAM that weren't before. I don't think we want that unless there is some really good performance benefit to be gained.

paulusmack avatar Aug 05 '22 00:08 paulusmack

@paulusmack - That sounds like a bug in the upstream LiteEth that should be fixed.

mithro avatar Aug 05 '22 00:08 mithro

FYI - @enjoy-digital

mithro avatar Aug 05 '22 18:08 mithro

@paulusmack nice catch!

antonblanchard avatar Aug 07 '22 05:08 antonblanchard

I ran both versions through yosys synth_xilinx, and at least yosys infers the same number of memories:

baseline:

2.28. Executing MEMORY_LIBMAP pass (mapping memories to cells).
mapping memory liteeth_core.mem via $__XILINX_BLOCKRAM_SDP_
mapping memory liteeth_core.mem_1 via $__XILINX_BLOCKRAM_SDP_
mapping memory liteeth_core.mem_2 via $__XILINX_BLOCKRAM_TDP_
mapping memory liteeth_core.mem_3 via $__XILINX_BLOCKRAM_TDP_
mapping memory liteeth_core.storage via $__XILINX_LUTRAM_SDP_
mapping memory liteeth_core.storage_1 via $__XILINX_LUTRAM_SDP_
mapping memory liteeth_core.storage_2 via $__XILINX_LUTRAM_SDP_
mapping memory liteeth_core.storage_3 via $__XILINX_LUTRAM_SDP_
mapping memory liteeth_core.storage_4 via $__XILINX_LUTRAM_SDP_

updated:

2.28. Executing MEMORY_LIBMAP pass (mapping memories to cells).
mapping memory liteeth_core.mem via $__XILINX_BLOCKRAM_SDP_
mapping memory liteeth_core.mem_1 via $__XILINX_BLOCKRAM_SDP_
mapping memory liteeth_core.mem_2 via $__XILINX_BLOCKRAM_TDP_
mapping memory liteeth_core.mem_3 via $__XILINX_BLOCKRAM_TDP_
mapping memory liteeth_core.storage via $__XILINX_LUTRAM_SDP_
Extracted data FF from read port 0 of liteeth_core.storage: $\storage$rdreg[0]
mapping memory liteeth_core.storage_1 via $__XILINX_LUTRAM_SDP_
mapping memory liteeth_core.storage_2 via $__XILINX_LUTRAM_SDP_
Extracted data FF from read port 0 of liteeth_core.storage_2: $\storage_2$rdreg[0]
mapping memory liteeth_core.storage_3 via $__XILINX_LUTRAM_SDP_
mapping memory liteeth_core.storage_4 via $__XILINX_LUTRAM_SDP_

However, it seems like Vivado is no longer inferring the 2x~1500B RAMs. I looked for differences, and the only thing I saw was the main_sram163_re conditional. If I comment it out, Vivado infers the memories again:

//------------------------------------------------------------------------------
// Memory mem_2: 383-words x 32-bit
//------------------------------------------------------------------------------
// Port 0 | Read: Sync  | Write: ---- | 
// Port 1 | Read: Sync  | Write: Sync | Mode: Write-First | Write-Granularity: 8 
reg [31:0] mem_2[0:382];
reg [8:0] mem_2_adr0;
reg [8:0] mem_2_adr1;
always @(posedge sys_clk) begin
        //if (main_sram163_re)                      <--- here
                mem_2_adr0 <= main_sram161_adr;
end
always @(posedge sys_clk) begin
        if (main_sram2_we[0])
                mem_2[main_sram2_adr][7:0] <= main_sram2_dat_w[7:0];
        if (main_sram2_we[1])
                mem_2[main_sram2_adr][15:8] <= main_sram2_dat_w[15:8];
        if (main_sram2_we[2])
                mem_2[main_sram2_adr][23:16] <= main_sram2_dat_w[23:16];
        if (main_sram2_we[3])
                mem_2[main_sram2_adr][31:24] <= main_sram2_dat_w[31:24];
        mem_2_adr1 <= main_sram2_adr;
end
assign main_sram162_dat_r = mem_2[mem_2_adr0];
assign main_sram2_dat_r = mem_2[mem_2_adr1];

It looks like a read enable line, is it really needed?

antonblanchard avatar Aug 08 '22 05:08 antonblanchard

This looks stale, should we file an issue with LiteX ?

ozbenh avatar Oct 23 '22 04:10 ozbenh

I recently regenerate liteeth for all the platforms that support it as part of adding support for the ECPIX-5 board, and I don't see the LUT utilization issue, so it looks like upstream Litex has fixed whatever the problem was. I'll close this issue and submit a PR for my ecpix-5 branch, which has regenerated litesdcard, liteeth and litedram code in it.

paulusmack avatar Apr 09 '24 08:04 paulusmack