circt icon indicating copy to clipboard operation
circt copied to clipboard

[FIRRTL] Add lowering option to inline memory wrapper modules (#5681)

Open sterin opened this issue 1 year ago • 8 comments

This PR addresses issue #5681.

When the new inlineMemoryWrapperModules is enabled, at the end HWMemSimImpl pass, it inlines the newly created memory wrapper modules and then removes them.

Input FIRRTL (from the issue):

FIRRTL version 3.0.0
circuit Foo:
  module Foo:
    input r: {addr: UInt<3>, en: UInt<1>, clk: Clock, flip data: UInt<32>}
    input w: {addr: UInt<3>, en: UInt<1>, clk: Clock, data: UInt<32>, mask: UInt<1>}

    mem memory :
      data-type => UInt<32>
      depth => 8
      read-latency => 1
      write-latency => 1
      reader => r
      writer => w
      read-under-write => undefined


    connect memory.r, r
    connect memory.w, w

Default output Verilog. The memory is inside a newly created memory_8x32 module:

module memory_8x32(
  input  [2:0]  R0_addr,
  input         R0_en,
                R0_clk,
  output [31:0] R0_data,
  input  [2:0]  W0_addr,
  input         W0_en,
                W0_clk,
  input  [31:0] W0_data
);

  reg [31:0] Memory[0:7];
  reg        _R0_en_d0;
  reg [2:0]  _R0_addr_d0;
  always @(posedge R0_clk) begin
    _R0_en_d0 <= R0_en;
    _R0_addr_d0 <= R0_addr;
  end // always @(posedge)
  always @(posedge W0_clk) begin
    if (W0_en & 1'h1)
      Memory[W0_addr] <= W0_data;
  end // always @(posedge)
  assign R0_data = _R0_en_d0 ? Memory[_R0_addr_d0] : 32'bx;
endmodule

module Foo(
  input  [2:0]  r_addr,
  input         r_en,
                r_clk,
  output [31:0] r_data,
  input  [2:0]  w_addr,
  input         w_en,
                w_clk,
  input  [31:0] w_data,
  input         w_mask
);

  memory_8x32 memory_ext (
    .R0_addr (r_addr),
    .R0_en   (r_en),
    .R0_clk  (r_clk),
    .R0_data (r_data),
    .W0_addr (w_addr),
    .W0_en   (w_en & w_mask),
    .W0_clk  (w_clk),
    .W0_data (w_data)
  );
endmodule

With the inlineMemoryWrapperModules lowering option, the memory is inside the parent module.

module Foo(
  input  [2:0]  r_addr,
  input         r_en,
                r_clk,
  output [31:0] r_data,
  input  [2:0]  w_addr,
  input         w_en,
                w_clk,
  input  [31:0] w_data,
  input         w_mask
);

  reg [31:0] memory_ext_Memory[0:7];
  reg        memory_ext__R0_en_d0;
  reg [2:0]  memory_ext__R0_addr_d0;
  always @(posedge r_clk) begin
    memory_ext__R0_en_d0 <= r_en;
    memory_ext__R0_addr_d0 <= r_addr;
  end // always @(posedge)
  always @(posedge w_clk) begin
    if (w_en & w_mask & 1'h1)
      memory_ext_Memory[w_addr] <= w_data;
  end // always @(posedge)
  assign r_data =
    memory_ext__R0_en_d0 ? memory_ext_Memory[memory_ext__R0_addr_d0] : 32'bx;
endmodule

Feedback will be highly appreceiate, I'm not very familiar with the LLVM/MLIR/CIRCT conventions.

A few qustions:

  • Should this really be a lowering option rather than a firtool option?
  • Is the name of the option appropriate?

sterin avatar Nov 27 '23 22:11 sterin

Thanks! I've rebased the pull request branch and implemented the above changes. It is now an option --inline-mem.

Thanks! I think the option makes sense. One question I have is how to handle inner symbols on instances. An easiest solution is to just give up inlining for instances with symbols.

To be honest, I haven't dug deep enough yet to fully understand what innner symbols are. However, should there be any inner symbols on these newly created instances of thse newly created memory wrapper modules?

sterin avatar Nov 28 '23 20:11 sterin

To be honest, I haven't dug deep enough yet to fully understand what innner symbols are. However, should there be any inner symbols on these newly created instances of thse newly created memory wrapper modules?

There is a (hopefully up-to-date) rationale doc for them: https://circt.llvm.org/docs/RationaleSymbols/

Sorry for the delayed review. Adding this generally makes sense. I can try to sneak a review in tonight.

seldridge avatar Nov 28 '23 21:11 seldridge

One thing that I missed with the original issue is that there may be a mechanism to enable something equivalent to the effect of this by turning on the MemToRegOfVec pass with an annotation. That API isn't great, but it is already load bearing. If this is being used for Chisel's combinational-read Mem (which maps to a FIRRTL memory with read-latency=0 and write-latency=1), then you can directly use a Reg(Vec(_, _)) which is basically the same as Mem.

That said, the approach in the PR, though duplicative, seems reasonable as a lowering option. Not everything in CIRCT is guaranteed to have access to the FIRRTL pipeline.

seldridge avatar Nov 28 '23 22:11 seldridge

I've noticed another problem with the pull request. The pass ends up with a verification failure when there's a firrtl.annotations.LoadMemoryAnnotation affecting the memory. I'll investigate a bit.

sterin avatar Nov 30 '23 18:11 sterin

Reading through, this PR is not ready, correct?

darthscsi avatar Dec 08 '23 16:12 darthscsi

Not yet, no.

It had problems multiple memories per module, and with out-of-line file initialization of memories. Multiple memories is easily solved, but out-of-line file initialization is complicated to handle when simpling inlining the previously generated wrapper modules.

I'll update this pull request with minimal changes that (1) fix the problem with multiple memories per module, and (2) disables inlining when the memory has out-of-line file initialization. In-line file intializaion works, so this should not be a big problem.

I'm also working on more elaborate pull request that instead of creating the wrapper modules and then inlining them, creates the registers inside the parent module in the first place.

sterin avatar Dec 11 '23 17:12 sterin

While testing I've discovered a few more problems:

This latest commit resolves the following problems:

  • Multiple memories in the same module did not work because inner symbols created for randomization were duplicated. I’ve made sure to prefix them.

  • The inner symbols in the sv::VerbatimOp nodes in the randomization code had to be prefixed as well so that randomization will continue to work.

  • Out-of-line file initialization (firrtl.annotations.LoadMemoryAnnotation) was too complex for me to handle as a module inlining operation.

I also added a few tests as suggested.

sterin avatar Dec 13 '23 18:12 sterin

@seldridge Could you review the latest commits?

sterin avatar Jan 04 '24 18:01 sterin