shadjis
[problem_app2.txt](https://github.com/stanford-ppl/spatial/files/2247324/problem_app2.txt) Maybe not an issue, but the attached simple example app has 2 parallel pipelines with 56 iterations each; one takes 68 cycles, the other 69 cycles. I am wondering if...
Initially this can be round-robin allocation, which is probably good enough for apps with many load/store channels. In the future, partitioning of data structures can also be implemented.
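A minimal sketch of what round-robin allocation could look like, assuming it means handing each load/store channel the next memory interface in turn. The function name, the stream names, and the count of 4 interfaces are hypothetical and only there for illustration; this is not the compiler's actual allocator.

```python
from itertools import cycle


def assign_round_robin(streams, num_interfaces):
    """Map each load/store stream to an interface index, cycling in order."""
    if num_interfaces <= 0:
        raise ValueError("num_interfaces must be positive")
    interface_ids = cycle(range(num_interfaces))
    # Each stream takes the next index; after the last index we wrap to 0.
    return {stream: next(interface_ids) for stream in streams}


# Hypothetical stream names and interface count, for illustration only.
streams = ["load_a", "load_b", "store_c", "load_d", "store_e"]
assignment = assign_round_robin(streams, num_interfaces=4)
print(assignment)
```

With more streams than interfaces the assignment simply wraps around, which balances the number of streams per interface but ignores how much traffic each stream generates. That is the gap that partitioning data structures would address later.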
Rather than use separate DSP blocks for multiplies and then adds, the Xilinx MACC IP can be used. This would reduce DSP utilization.
URAMs are 72 bits wide, which means that two 32-bit words can fit in each entry (or four 16-bit words, etc.). This can be done in the case of sequential...
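The packing arithmetic can be sketched in software as follows. This is only a model of the bit layout, assuming the first word sits in the lowest bits of the entry and that a word address maps to an entry by integer division; the helper names are hypothetical and the real hardware mapping may differ.

```python
URAM_WIDTH_BITS = 72  # entry width stated in the text


def words_per_entry(word_bits):
    """How many whole words of the given width fit in one 72-bit entry."""
    return URAM_WIDTH_BITS // word_bits


def pack_entry(words, word_bits):
    """Pack words into one integer, first word in the lowest bits (assumed layout)."""
    if len(words) > words_per_entry(word_bits):
        raise ValueError("too many words for one entry")
    mask = (1 << word_bits) - 1
    entry = 0
    for slot, word in enumerate(words):
        entry |= (word & mask) << (slot * word_bits)
    return entry


def unpack_entry(entry, word_bits, count):
    """Recover `count` words from a packed entry."""
    mask = (1 << word_bits) - 1
    return [(entry >> (slot * word_bits)) & mask for slot in range(count)]


def locate(word_address, word_bits):
    """Map a word address to (entry index, slot within that entry)."""
    return divmod(word_address, words_per_entry(word_bits))


packed = pack_entry([0xDEADBEEF, 0x12345678], 32)
print(hex(packed), unpack_entry(packed, 32, 2), locate(5, 32))
```

Note that two 32-bit words use 64 of the 72 bits, so 8 bits per entry stay unused under this layout.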
The F1 has UltraRAMs which can be used for larger SRAMs. However, SRAMs need to be explicitly assigned to URAMs using the following syntax: `(* ram_style = "ultra" *) reg...
Example stride syntax:

**LB:** `val lb = LineBuffer[T](5, 28, s)`

**SR:** `sr(r, *).shift2 ( lb(r, c::c+s) ) { y => y }`

An alternative syntax we discussed for SR was...
It would be nice to have a function similar to numpy.fromfile, e.g.

```
ReadTensor(file_path, datatype, dims)
```

Where the file is in binary format, e.g. 4 byte float, 1 byte...
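For reference, a rough host-side sketch of the requested behavior using only the Python standard library. The argument order follows the `ReadTensor(file_path, datatype, dims)` signature above; the datatype names, the little-endian byte order, and returning a flat row-major list are assumptions, not part of the request.

```python
import os
import struct
import tempfile
from math import prod

# Hypothetical datatype names mapped to struct format codes (assumption).
FORMATS = {"float32": "f", "int32": "i", "uint8": "B", "int8": "b"}


def read_tensor(file_path, datatype, dims):
    """Read a raw little-endian binary file into a flat, row-major list."""
    code = FORMATS[datatype]
    count = prod(dims)
    item_size = struct.calcsize("<" + code)
    with open(file_path, "rb") as f:
        raw = f.read()
    # Fail loudly if the file size does not match the requested shape.
    if len(raw) != count * item_size:
        raise ValueError(
            f"expected {count * item_size} bytes for dims {dims}, got {len(raw)}"
        )
    return list(struct.unpack(f"<{count}{code}", raw))


# Usage: write six 4-byte floats to a temporary file and read them back as 2x3.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(struct.pack("<6f", 0.0, 1.0, 2.0, 3.0, 4.0, 5.0))
values = read_tensor(tmp.name, "float32", (2, 3))
os.remove(tmp.name)
print(values)
```

In NumPy itself the same read is roughly `numpy.fromfile(file_path, dtype=...).reshape(dims)`; the size check here is an extra guard that a tensor reader would probably want before reshaping.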