Use URAMs on F1

Open shadjis opened this issue 7 years ago • 6 comments

The F1 has UltraRAMs which can be used for larger SRAMs. However, SRAMs need to be explicitly assigned to URAMs using the following syntax:

(* ram_style = "ultra" *) reg [DWIDTH-1:0] mem [0:WORDS-1];

One way to do this is to have an analysis pass which:

  1. gets a list of all SRAMs,
  2. sorts the list by size, and
  3. keeps track of the largest 800 SRAMs (there are 800 URAMs on the F1)

For example, the pass can record the size of the 800th-largest SRAM, and code generation can then use a different template for SRAMs larger than that.
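The three steps above can be sketched roughly as follows. This is only an illustration in Python, not the actual Spatial compiler pass: `pick_uram_candidates` and the `sram_bits` mapping (SRAM name to size in bits) are hypothetical names, and the budget of 800 is the figure quoted above.

```python
from typing import Dict, Optional, Set, Tuple

URAM_BUDGET = 800  # number of URAMs quoted above for the F1


def pick_uram_candidates(
    sram_bits: Dict[str, int], budget: int = URAM_BUDGET
) -> Tuple[Set[str], Optional[int]]:
    """Return the names of the `budget` largest SRAMs and the size of the
    smallest SRAM that was selected (None if there are no SRAMs)."""
    # Steps 1 and 2: collect all SRAMs and sort them by size, largest first.
    ranked = sorted(sram_bits.items(), key=lambda kv: kv[1], reverse=True)
    # Step 3: keep only the largest `budget` entries.
    chosen = ranked[:budget]
    threshold = chosen[-1][1] if chosen else None
    return {name for name, _ in chosen}, threshold


# Tiny example with a budget of 2 instead of 800.
srams = {"a": 1024, "b": 65536, "c": 4096, "d": 32768}
names, threshold = pick_uram_candidates(srams, budget=2)
print(sorted(names), threshold)
```

Keeping the set of selected names, rather than only the threshold size, avoids exceeding the budget when several SRAMs tie at the threshold.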

shadjis, Oct 04 '17 05:10

A single SRAM may take more than 1 URAM, but other than that yep this should work.

Are there any downsides to using a URAM over an SRAM (e.g. not dual ported, higher latency, etc.)?

dkoeplin, Oct 04 '17 06:10

Latency should be the same (1 cycle), and URAMs are dual-ported. However, URAM ports are 72 bits wide, twice the width of BRAM ports, and cannot be configured to a narrower width. In other words, unlike BRAMs, a URAM does not provide more depth when used with a narrower width. This means that without fixing #231, URAM usage will be quite inefficient for narrower data types.
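To put rough numbers on that inefficiency, here is a small sketch. It assumes a simple tiling model in which each URAM is a fixed 72-bit by 4096-word block and narrower words are not packed; the function names are hypothetical and the real Vivado mapping may differ.

```python
import math

URAM_WIDTH_BITS = 72     # fixed port width mentioned above
URAM_DEPTH_WORDS = 4096  # depth of one URAM without packing


def urams_without_packing(width_bits: int, depth_words: int) -> int:
    """URAMs needed if each memory word occupies its own 72-bit URAM word(s)."""
    columns = math.ceil(width_bits / URAM_WIDTH_BITS)
    rows = math.ceil(depth_words / URAM_DEPTH_WORDS)
    return columns * rows


def utilization(width_bits: int, depth_words: int) -> float:
    """Fraction of the allocated URAM bits that actually hold data."""
    used = width_bits * depth_words
    count = urams_without_packing(width_bits, depth_words)
    return used / (count * URAM_WIDTH_BITS * URAM_DEPTH_WORDS)


for width in (8, 16, 32, 72):
    print(width, urams_without_packing(width, 4096), round(utilization(width, 4096), 3))
```

Under this model a 32-bit, 4096-word SRAM fills 32/72 of one URAM, and an 8-bit one fills only 8/72.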

raghup17, Oct 04 '17 06:10

Also David, the case of 1 SRAM being >1 URAM may be a bit complicated since URAMs can be cascaded to implement bigger URAMs. However, enabling cascading reduces the number of URAMs available: https://github.com/aws/aws-fpga/blob/master/hdk/cl/examples/cl_uram_example/README.md#implementation-options

If cascading is not enabled, and something > 4096 words is given a uram directive, I'm not sure if this will:

  1. still use multiple URAMs but non-dedicated routing (e.g. may impact timing and routability),
  2. use block rams instead, or
  3. fail

In case 1 or 2 this should be OK, but in case 3 we might want to omit SRAMs > 4096 words from the URAM list. For now, though, I think we can just assume 1 SRAM per URAM and handle the more complicated case later. Also, as Raghu said, 4096 is the depth without packing into the 72-bit word width (#231), so with packing it might actually be 8k or 16k words. This might be larger than anything we ever need, so cascading may not be necessary.
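The 8k and 16k figures follow from how many data words would fit in one 72-bit URAM word if packing (#231) were implemented. A minimal sketch of that arithmetic, where `packed_depth` is a hypothetical helper and the packing scheme is assumed to place whole words side by side:

```python
def packed_depth(width_bits: int, uram_width: int = 72, uram_depth: int = 4096) -> int:
    """Effective depth of a single URAM when several narrow words are
    packed side by side into each 72-bit URAM word."""
    if not 0 < width_bits <= uram_width:
        raise ValueError("word width must be between 1 and the URAM width")
    words_per_row = uram_width // width_bits
    return words_per_row * uram_depth


for width in (72, 36, 32, 18, 16):
    print(width, packed_depth(width))
```

With these assumptions, 36-bit words give 8192 entries per URAM and 18-bit words give 16384.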

shadjis, Oct 04 '17 18:10

Ah, interesting, thanks for pointing this out. In that case, if I see a bank larger than 4096 words, I won't include it in the URAM candidate list for now. This doesn't happen very often in practice, so this simple solution should work OK.
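A sketch of what that filter could look like, again in illustrative Python rather than the real compiler code. The `Bank` record and `uram_candidates` function are hypothetical; the 4096-word limit and the budget of 800 come from the discussion above.

```python
from typing import List, NamedTuple


class Bank(NamedTuple):
    name: str
    width_bits: int
    depth_words: int


def uram_candidates(
    banks: List[Bank], max_depth: int = 4096, budget: int = 800
) -> List[Bank]:
    """Drop banks deeper than one URAM, then keep the largest `budget` banks."""
    eligible = [b for b in banks if b.depth_words <= max_depth]
    eligible.sort(key=lambda b: b.width_bits * b.depth_words, reverse=True)
    return eligible[:budget]


banks = [Bank("x", 32, 8192), Bank("y", 32, 4096), Bank("z", 8, 512)]
print([b.name for b in uram_candidates(banks, budget=1)])
```

Here bank `x` is skipped because it is deeper than 4096 words, so `y` is the only candidate selected.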

dkoeplin, Oct 04 '17 21:10

Do we have the metadata yet that tells me if I should uramify a memory?

mattfel1, Oct 05 '17 17:10

Not yet - will be adding it today

dkoeplin, Oct 05 '17 18:10