Inferring True Dual-Port BRAMs with the new IR on Series 7 and ECP5

Open fischermoseley opened this issue 1 year ago • 2 comments

I'm trying to infer a true dual-port BRAM on a Xilinx Series 7 chip, but I'm not having much luck. I've written the following to map the ReadPort/WritePort construct onto the addr/din/dout/en/we signals that are native to the RAMB36 primitives:

from amaranth import *
from amaranth_boards.nexys4ddr import Nexys4DDRPlatform
from amaranth.sim import Simulator
from random import randint

class TrueDualPort(Elaboratable):
    def __init__(self, width, depth):
        self.mem = Memory(
            width=width,
            depth=depth,
            init = [randint(0, 2**width-1) for _ in range(depth)])

        self.addra = Signal(range(depth))
        self.dina = Signal(width)
        self.douta = Signal(width)
        self.wea = Signal()
        self.ena = Signal()

        self.addrb = Signal(range(depth))
        self.dinb = Signal(width)
        self.doutb = Signal(width)
        self.web = Signal()
        self.enb = Signal()

    def elaborate(self, platform):
        m = Module()

        m.submodules["mem"] = self.mem
        rp0 = self.mem.read_port()
        wp0 = self.mem.write_port()

        rp1 = self.mem.read_port()
        wp1 = self.mem.write_port()

        m.d.comb += rp0.addr.eq(self.addra)
        m.d.comb += wp0.addr.eq(self.addra)
        m.d.comb += wp0.data.eq(self.dina)
        m.d.comb += self.douta.eq(rp0.data)
        m.d.comb += rp0.en.eq(self.ena)
        m.d.comb += wp0.en.eq(self.wea & self.ena)

        m.d.comb += rp1.addr.eq(self.addrb)
        m.d.comb += wp1.addr.eq(self.addrb)
        m.d.comb += wp1.data.eq(self.dinb)
        m.d.comb += self.doutb.eq(rp1.data)
        m.d.comb += rp1.en.eq(self.enb)
        m.d.comb += wp1.en.eq(self.web & self.enb)
        return m

If I wrap it in a test design like the following, Vivado recognizes it as a true dual-port BRAM. I'm building for the Nexys4DDR, which has 16 switches and 16 LEDs. I'm using the top half of each for Port A and the bottom half for Port B, just to prevent the memory from being optimized out.

class TrueDualPortTest(Elaboratable):
    def elaborate(self, platform):
        m = Module()
        m.submodules["tdp"] = tdp = TrueDualPort(8, 4096)

        # Request all 16 switches and LEDs so that both halves ([8:] and [:8]) are non-empty.
        sw_pins = Cat([platform.request("switch",i).i for i in range(16)])
        led_pins = Cat([platform.request("led",i).o for i in range(16)])

        counter = Signal(8)
        m.d.sync += counter.eq(counter + 1)

        # ---- Port A ----
        m.d.comb += tdp.ena.eq(1)
        m.d.comb += tdp.wea.eq(1)
        # m.d.sync += tdp.dina.eq(counter) # <- uncommenting this makes Vivado unable to recognize it as a TDP!
        m.d.sync += tdp.addra.eq(sw_pins[8:])
        m.d.sync += led_pins[8:].eq(tdp.douta)


        # ---- Port B ----
        m.d.comb += tdp.enb.eq(1)
        m.d.comb += tdp.web.eq(0)
        # m.d.sync += tdp.dinb.eq(counter + 3) #  <- uncommenting this makes Vivado unable to recognize it as a TDP!
        m.d.sync += tdp.addrb.eq(sw_pins[:8])
        m.d.sync += led_pins[:8].eq(tdp.doutb)

        return m

If I build this with Nexys4DDRPlatform().build(TrueDualPortTest()), then Vivado will happily recognize this as a true dual port BRAM, as is seen in the logs:

INFO: [Synth 8-3971] The signal "\top.tdp :/mem_reg" was recognized as a true dual port RAM template.

The post-placement resource utilization report also shows that block RAM is being used. However, if I uncomment either (or both) of the two lines marked in the source above, Vivado maps everything to distributed RAM instead. This still holds as I keep increasing the depth of the RAM.

I also noticed that if I assign the en and we signals synchronously instead of combinationally, Vivado will throw an Unrecognized RAM template error.

My question is: how do I create a true dual-port RAM in Amaranth? I'd like to avoid directly instantiating an inferred BRAM template, since I'm not able to use the built-in simulator on external Verilog.

Thanks everyone!

Related Info:

I did notice that a while back the Yosys memory interface was reworked and support for TDP memories was added - I'm not sure if that has any implications for Amaranth, though. https://github.com/YosysHQ/yosys/issues/1959

And that appears to be reflected in the yosys docs: https://yosyshq.readthedocs.io/projects/yosys/en/latest/CHAPTER_Memorymap.html

fischermoseley avatar Jan 03 '24 00:01 fischermoseley

@mwkmwkmwk Could you take a look, please?

whitequark avatar Jan 03 '24 08:01 whitequark

I noticed that RFC 45 was implemented in 890e099ec3450306bc841311365d09e932d1b46f, so I thought I'd give this another go with the new lib.memory.Memory module, and I'm noticing some interesting behavior. Using the Amaranth release from PyPI, I wasn't able to infer a TDP RAM when building for Series 7 parts, but I could for ECP5 devices. Now if I use the latest from git (a586df89ad43bfe55dcad207bae9f7c8106046fe at the time of writing), it's flipped: I'm able to infer a TDP RAM for the Series 7, but not for the ECP5 device anymore! This previously worked, so I believe this is a regression from the last release.

I've tried my best to elaborate on each case below. I've installed Amaranth with its builtin Yosys, and I'm using Vivado v2023.1 to build for the Series 7, and Yosys 0.38+92 and nextpnr-ecp5 0.7-11-g05ed9308 for the ECP5.

I've provided the full source at this gist.

Successfully synthesizing a TDP RAM for the Series 7, but not the ECP5.

If I use a586df89ad43bfe55dcad207bae9f7c8106046fe, Vivado correctly infers a TDP RAM, as seen in top.log:

INFO: [Synth 8-3971] The signal "\top.memory :/mem_reg" was recognized as a true dual port RAM template.

And it chooses to implement it in Block Memory, as seen in top_utilization_place.rpt:

+-------------------+------+-------+------------+-----------+-------+
|     Site Type     | Used | Fixed | Prohibited | Available | Util% |
+-------------------+------+-------+------------+-----------+-------+
| Block RAM Tile    |  0.5 |     0 |          0 |       135 |  0.37 |
|   RAMB36/FIFO*    |    0 |     0 |          0 |       135 |  0.00 |
|   RAMB18          |    1 |     0 |          0 |       270 |  0.37 |
|     RAMB18E1 only |    1 |       |            |           |       |
+-------------------+------+-------+------------+-----------+-------+

Yosys, however, is unable to map the memory to a TDP RAM, and instead implements it entirely in fabric logic (flip-flops and LUTs):

Info: Device utilisation:
Info: 	          TRELLIS_IO:     4/  245     1%
Info: 	                DCCA:     1/   56     1%
Info: 	              DP16KD:     0/  108     0%
Info: 	          MULT18X18D:     0/   72     0%
Info: 	              ALU54B:     0/   36     0%
Info: 	             EHXPLLL:     0/    4     0%
Info: 	             EXTREFB:     0/    2     0%
Info: 	                DCUA:     0/    2     0%
Info: 	           PCSCLKDIV:     0/    2     0%
Info: 	             IOLOGIC:     0/  160     0%
Info: 	            SIOLOGIC:     0/   85     0%
Info: 	                 GSR:     1/    1   100%
Info: 	               JTAGG:     0/    1     0%
Info: 	                OSCG:     0/    1     0%
Info: 	               SEDGA:     0/    1     0%
Info: 	                 DTR:     0/    1     0%
Info: 	             USRMCLK:     0/    1     0%
Info: 	             CLKDIVF:     0/    4     0%
Info: 	           ECLKSYNCB:     0/   10     0%
Info: 	             DLLDELD:     0/    8     0%
Info: 	              DDRDLL:     0/    4     0%
Info: 	             DQSBUFM:     0/   10     0%
Info: 	     TRELLIS_ECLKBUF:     0/    8     0%
Info: 	        ECLKBRIDGECS:     0/    2     0%
Info: 	                DCSC:     0/    2     0%
Info: 	          TRELLIS_FF: 16625/43848    37%
Info: 	        TRELLIS_COMB: 57328/43848   130%
Info: 	        TRELLIS_RAMW:     0/ 5481     0%

This uses more resources than the ECP5 has available, so naturally nextpnr throws an error:

ERROR: Unable to place cell 'memory.douta_TRELLIS_FF_Q_DI_LUT4_Z_1_A_PFUMX_Z_C0_PFUMX_Z_C0_LUT4_Z_1_A_PFUMX_Z_C0_LUT4_Z', no BELs remaining to implement cell type 'TRELLIS_COMB'
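
When comparing builds like the two in this comment, the utilisation report can be checked mechanically rather than by eye. The following is only a minimal sketch, assuming the report keeps the `Info: <cell>: used/available pct%` line format quoted above. The helper names (`parse_utilisation`, `over_budget`, `uses_block_ram`) are made up for illustration and are not part of nextpnr, Yosys, or Amaranth.

```python
import re

# Matches lines such as "Info:            DP16KD:     0/  108     0%".
# The header line "Info: Device utilisation:" does not match, because no
# "used/available" pair follows the colon.
_UTIL_LINE = re.compile(r"^Info:\s+(\w+):\s*(\d+)\s*/\s*(\d+)\s+\d+%\s*$")


def parse_utilisation(log_text):
    """Return {cell_type: (used, available)} from a utilisation report."""
    usage = {}
    for line in log_text.splitlines():
        match = _UTIL_LINE.match(line)
        if match:
            cell, used, available = match.groups()
            usage[cell] = (int(used), int(available))
    return usage


def over_budget(usage):
    """List the cell types that need more instances than the device has."""
    return sorted(cell for cell, (used, avail) in usage.items() if used > avail)


def uses_block_ram(usage, bram_cell="DP16KD"):
    """True if at least one block RAM cell of the given type is in use.

    "DP16KD" is the ECP5 block RAM cell name that appears in the report above.
    """
    used, _available = usage.get(bram_cell, (0, 0))
    return used > 0


# A few lines copied from the failing ECP5 report above.
sample = (
    "Info: Device utilisation:\n"
    "Info: \t          TRELLIS_IO:     4/  245     1%\n"
    "Info: \t              DP16KD:     0/  108     0%\n"
    "Info: \t          TRELLIS_FF: 16625/43848    37%\n"
    "Info: \t        TRELLIS_COMB: 57328/43848   130%\n"
)

usage = parse_utilisation(sample)
print("block RAM used:", uses_block_ram(usage))
print("over budget:", over_budget(usage))
```

On the excerpt above, this reports that no DP16KD is used and that TRELLIS_COMB exceeds the device capacity, which matches the placement error from nextpnr.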

Successfully synthesizing a TDP RAM for the ECP5, but not the Series 7.

If I use the latest stable release of Amaranth from PyPI, Vivado is unable to infer the RAM type, and produces this error before the synthesis run fails a bit later:

ERROR: [Synth 8-2914] Unsupported RAM template [/home/fischerm/memory_testing/build/top.v:1873]
ERROR: [Synth 8-5743] Unable to infer RAMs due to unsupported pattern.
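
The Vivado outcomes quoted in this thread can be told apart by their message IDs alone. Here is a small sketch that classifies a synthesis log that way. It uses only the three IDs that appear in the excerpts above (Synth 8-3971, 8-2914, 8-5743). The function name `classify_ram_inference` and its return strings are hypothetical. A silent fallback to distributed RAM produces none of these messages, so that case is reported as "unknown" and still needs the utilization report.

```python
import re

# Message IDs taken from the log excerpts quoted in this thread.
TDP_RECOGNIZED = "Synth 8-3971"
RAM_ERRORS = ("Synth 8-2914", "Synth 8-5743")

# Finds every "SEVERITY: [Synth N-NNNN]" tag, even when several messages
# ended up on one line.
_MESSAGE = re.compile(r"(INFO|WARNING|ERROR): \[(Synth \d+-\d+)\]")


def classify_ram_inference(log_text):
    """Classify a synthesis log as 'unsupported', 'true_dual_port' or 'unknown'."""
    ids = {match.group(2) for match in _MESSAGE.finditer(log_text)}
    if any(error_id in ids for error_id in RAM_ERRORS):
        return "unsupported"
    if TDP_RECOGNIZED in ids:
        return "true_dual_port"
    # No RAM-related message seen (e.g. distributed RAM was chosen silently).
    return "unknown"


good_log = (
    'INFO: [Synth 8-3971] The signal "\\top.memory :/mem_reg" '
    "was recognized as a true dual port RAM template."
)
print(classify_ram_inference(good_log))
```

Errors are checked before the "recognized" message so that a log containing both is still treated as a failed inference.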

Yosys however has no troubles, and maps to a TDP Block RAM, and the design fits inside the ECP5:

Info: Device utilisation:
Info: 	          TRELLIS_IO:     4/  245     1%
Info: 	                DCCA:     1/   56     1%
Info: 	              DP16KD:     1/  108     0%
Info: 	          MULT18X18D:     0/   72     0%
Info: 	              ALU54B:     0/   36     0%
Info: 	             EHXPLLL:     0/    4     0%
Info: 	             EXTREFB:     0/    2     0%
Info: 	                DCUA:     0/    2     0%
Info: 	           PCSCLKDIV:     0/    2     0%
Info: 	             IOLOGIC:     0/  160     0%
Info: 	            SIOLOGIC:     0/   85     0%
Info: 	                 GSR:     1/    1   100%
Info: 	               JTAGG:     0/    1     0%
Info: 	                OSCG:     0/    1     0%
Info: 	               SEDGA:     0/    1     0%
Info: 	                 DTR:     0/    1     0%
Info: 	             USRMCLK:     0/    1     0%
Info: 	             CLKDIVF:     0/    4     0%
Info: 	           ECLKSYNCB:     0/   10     0%
Info: 	             DLLDELD:     0/    8     0%
Info: 	              DDRDLL:     0/    4     0%
Info: 	             DQSBUFM:     0/   10     0%
Info: 	     TRELLIS_ECLKBUF:     0/    8     0%
Info: 	        ECLKBRIDGECS:     0/    2     0%
Info: 	                DCSC:     0/    2     0%
Info: 	          TRELLIS_FF:   243/43848     0%
Info: 	        TRELLIS_COMB:   542/43848     1%
Info: 	        TRELLIS_RAMW:     0/ 5481     0%

Would be super interested in any thoughts on this. I'm also happy to attempt implementing a solution, if one is known :)

Thank you all! -Fischer

fischermoseley avatar Feb 25 '24 22:02 fischermoseley

Well done @wanda-phi!

whitequark avatar May 09 '24 03:05 whitequark