liteeth: Regenerate
Wow, this seems a lot bigger: 27064 LUTs on Arty A7-100 for my usual build including litesdcard, vs. 25148 before. That's nearly 2000 more LUTs, but 4 fewer block RAM tiles. It looks like we are now using LUTs for memory where we previously used block RAMs:
After:
+----------------------------+-------+-------+-----------+-------+
|          Site Type         |  Used | Fixed | Available | Util% |
+----------------------------+-------+-------+-----------+-------+
| Slice LUTs                 | 27064 |     2 |     63400 | 42.69 |
|   LUT as Logic             | 23772 |     2 |     63400 | 37.50 |
|   LUT as Memory            |  3292 |     0 |     19000 | 17.33 |
|     LUT as Distributed RAM |  3290 |     0 |           |       |
|     LUT as Shift Register  |     2 |     0 |           |       |
Before:
+----------------------------+-------+-------+-----------+-------+
|          Site Type         |  Used | Fixed | Available | Util% |
+----------------------------+-------+-------+-----------+-------+
| Slice LUTs                 | 25148 |     2 |     63400 | 39.67 |
|   LUT as Logic             | 23586 |     2 |     63400 | 37.20 |
|   LUT as Memory            |  1562 |     0 |     19000 |  8.22 |
|     LUT as Distributed RAM |  1560 |     0 |           |       |
|     LUT as Shift Register  |     2 |     0 |           |       |
So that's 1730 LUTs being used as distributed RAM that weren't before. I don't think we want that unless there is some really good performance benefit to be gained.
@paulusmack - That sounds like a bug in upstream LiteEth that should be fixed.
FYI - @enjoy-digital
@paulusmack nice catch!
I ran both versions through yosys synth_xilinx, and at least yosys infers the same number of memories:
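(Something like yosys -p 'synth_xilinx -top liteeth_core' liteeth_core.v, assuming the generated core is in liteeth_core.v, should reproduce the MEMORY_LIBMAP output below.)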
baseline:
2.28. Executing MEMORY_LIBMAP pass (mapping memories to cells).
mapping memory liteeth_core.mem via $__XILINX_BLOCKRAM_SDP_
mapping memory liteeth_core.mem_1 via $__XILINX_BLOCKRAM_SDP_
mapping memory liteeth_core.mem_2 via $__XILINX_BLOCKRAM_TDP_
mapping memory liteeth_core.mem_3 via $__XILINX_BLOCKRAM_TDP_
mapping memory liteeth_core.storage via $__XILINX_LUTRAM_SDP_
mapping memory liteeth_core.storage_1 via $__XILINX_LUTRAM_SDP_
mapping memory liteeth_core.storage_2 via $__XILINX_LUTRAM_SDP_
mapping memory liteeth_core.storage_3 via $__XILINX_LUTRAM_SDP_
mapping memory liteeth_core.storage_4 via $__XILINX_LUTRAM_SDP_
updated:
2.28. Executing MEMORY_LIBMAP pass (mapping memories to cells).
mapping memory liteeth_core.mem via $__XILINX_BLOCKRAM_SDP_
mapping memory liteeth_core.mem_1 via $__XILINX_BLOCKRAM_SDP_
mapping memory liteeth_core.mem_2 via $__XILINX_BLOCKRAM_TDP_
mapping memory liteeth_core.mem_3 via $__XILINX_BLOCKRAM_TDP_
mapping memory liteeth_core.storage via $__XILINX_LUTRAM_SDP_
Extracted data FF from read port 0 of liteeth_core.storage: $\storage$rdreg[0]
mapping memory liteeth_core.storage_1 via $__XILINX_LUTRAM_SDP_
mapping memory liteeth_core.storage_2 via $__XILINX_LUTRAM_SDP_
Extracted data FF from read port 0 of liteeth_core.storage_2: $\storage_2$rdreg[0]
mapping memory liteeth_core.storage_3 via $__XILINX_LUTRAM_SDP_
mapping memory liteeth_core.storage_4 via $__XILINX_LUTRAM_SDP_
However, it seems like Vivado is no longer inferring the two ~1500-byte RAMs. I looked for differences between the two generated files, and the only one I saw was the main_sram163_re conditional. If I comment it out, Vivado infers the memories again:
//------------------------------------------------------------------------------
// Memory mem_2: 383-words x 32-bit
//------------------------------------------------------------------------------
// Port 0 | Read: Sync | Write: ---- |
// Port 1 | Read: Sync | Write: Sync | Mode: Write-First | Write-Granularity: 8
reg [31:0] mem_2[0:382];
reg [8:0] mem_2_adr0;
reg [8:0] mem_2_adr1;
always @(posedge sys_clk) begin
    //if (main_sram163_re)   <--- here
        mem_2_adr0 <= main_sram161_adr;
end
always @(posedge sys_clk) begin
    if (main_sram2_we[0])
        mem_2[main_sram2_adr][7:0] <= main_sram2_dat_w[7:0];
    if (main_sram2_we[1])
        mem_2[main_sram2_adr][15:8] <= main_sram2_dat_w[15:8];
    if (main_sram2_we[2])
        mem_2[main_sram2_adr][23:16] <= main_sram2_dat_w[23:16];
    if (main_sram2_we[3])
        mem_2[main_sram2_adr][31:24] <= main_sram2_dat_w[31:24];
    mem_2_adr1 <= main_sram2_adr;
end
assign main_sram162_dat_r = mem_2[mem_2_adr0];
assign main_sram2_dat_r = mem_2[mem_2_adr1];
It looks like a read-enable line; is it really needed?
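If the enable does need to stay, one BRAM-friendly shape (the form Vivado's RAM inference templates use) is to put the enable on a registered read-data output instead of gating the address register. A minimal sketch, where rd_en and rd_addr are hypothetical stand-ins for main_sram163_re and main_sram161_adr:

// Sketch only: enabled synchronous read in the template form Vivado
// infers as block RAM; the enable gates the data register, not the
// address register.
reg [31:0] rd_data;
always @(posedge sys_clk) begin
    if (rd_en)
        rd_data <= mem_2[rd_addr]; // synchronous read under enable
end

Note this holds rd_data stable between enabled reads, whereas the generated code's combinational read from a latched address would also see later writes to that address, so it may not be an exact drop-in.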
This looks stale; should we file an issue with LiteX?
I recently regenerated liteeth for all the platforms that support it as part of adding support for the ECPIX-5 board, and I don't see the LUT utilization issue there, so it looks like upstream LiteX has fixed whatever the problem was. I'll close this issue and submit a PR from my ecpix-5 branch, which has regenerated litesdcard, liteeth, and litedram code in it.