coreir CoreIR clock gated registers don't get inferred by synthesis tools

When using the clock-gated registers output by CoreIR, the synthesis tools do not recognize this and put a correctly clock-gated register cell in the design. @mbstrange2 has more details.

In the memory tile this is something like a 40% reduction in power when using properly clock gated registers. Right now 80% of the power in Garnet is related to the clock and this is a large reason why.

Oct 05 '19 03:10 hofstee

Since coreir does a logical clock gating (using the CE to mux between the register's current value and the update value), it won't get inferred as a real clock gate that would be inferred by the always@ description: always @(posedge clk) begin if(CE) reg <= new_value; end

I simply replaced all Register_.... modules with the always descriptions, so it's a simple fix.

Oct 05 '19 04:10 mbstrange2

I can provide a coreir fix for this as well (basically Max's solution). Can you send me a simple example coreir file that has this issue?

Oct 06 '19 13:10 rdaly525

fifo.v

all the Register_has_* modules have this issue

Oct 07 '19 16:10 mbstrange2

After looking through the maze of code from Garnet through Gemstone, Mantle, Magma, and back to CoreIR, this is what I think needs to happen:

include/coreir/definitions/coreVerilog.hpp needs to be modified to add in register primitives that have enables and clock enables. The Verilog should look something like this:

module reg(input logic clk, rst_n,
           input logic [WIDTH-1:0]  d,
           output logic [WIDTH-1:0] d);

   always @(posedge clk, negedge rst_n) begin
      if (~rst_n) begin
         q <= init;
      end else begin
         if (clk_en && en) begin
            q <= d;
         end
      end
   end

endmodule

src/ir/headers/mantle.hpp needs to be modified to remove the code that creates muxes and use the new modules instead.

From there, mantle and magma need to be updated in a similar way so mantle doesn't create a similar mux for the clk_en.

Oct 10 '19 00:10 hofstee

From what I understand, the issue is that the synthesis tool won't recognize the clock gating pattern unless it's specified in a certain form (e.g. the above verilog always block style). This seems like a target specific issue, ideally in the long term we could discover the "logical" clock gating pattern (mux on the input to the register) and replace it with a specialized version that emits the above verilog code for synthesis quality (only when we're emitting verilog). It seems that logically these two descriptions are equivalent, but one has support for a "pattern matching" mechanism in the downstream tool.

however, for now I think providing a coreir primitive with this would work as temporary fix. @rdaly525 are you thinking we just add another register primitive (or change the existing one) that emits this kind of verilog (e.g. a reg with enable primitive). This could also facilitate mapping to FPGAs where the primitives typically have enables built in, so perhaps this actually does make sense to have as a built-in primitive (rather than as an optimization as I mentioned above), although that doesn't mean we can't have an optimization that transforms logical patterns into these more specialized forms.

Oct 10 '19 19:10 leonardt

Right, synthesis tools are tuned for this kind of behavioral description. When you explicitly instantiate a mux and use CE as select to that mux, synthesis will synthesize exactly that, it won't infer a real clock enable. current coreir does write enable on the register, not clock enable.

Oct 10 '19 19:10 mbstrange2

This is definitely a target-specific issue and should be taken care of in the backend opposed to changing the CoreIR primitives.

A good quick solution should be to define a mantle register module (declared in mantle.hpp) that has the specific interface with the CE. Then have a separate target-specific verilog definition for that mantle module with the appropriate always block. This is basically the same way we dealt with the CW floating point verilog.

Oct 10 '19 23:10 rdaly525

Thanks @norabarlow. One questions. If we do this soon, will it help with your paper? If so, we should bump the priority.

Nov 08 '19 20:11 phanrahan

It would be helpful, but not absolutely necessary. We have written a script to add in clock gating in the generated Verilog (not a perfect solution, but it performs the functionality we want).

Nov 08 '19 21:11 norabarlow

coreir coreir copied to clipboard

CoreIR clock gated registers don't get inferred by synthesis tools

coreir
coreir copied to clipboard