Pipeline multiply and other primitives
This came up over here: https://github.com/cucapra/calyx/pull/909#discussion_r811391088. The multiplier and some of the other non-trivial primitives use multiple stages internally, and it should be possible to pipeline them with II=1. Now that we are starting to push on statically timed pipelines this could yield some nice improvements.
Outlining an implementation sketch for anyone interested in implementing this. The current [std_mult_pipe][smp] is actually already pipelined internally so that Xilinx tools infer it as a DSP. However, the interface does not expose any way to use the multiplier in a pipelined way because it only implements a one-sided ready-valid interface. To make this work, we can start with the primitive and expose a proper, double-sided ready-valid interface. This new primitive should probably live in a separate library file. Next, we can change the code generation for SCF programs to explicitly model reads from and writes to the multiplier when generating the pipeline so that they can be pushed into the right stages.
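To make the II=1 point concrete, here is a rough cycle-level model in Python (not Calyx or Verilog). It is only a sketch: the `PipelinedMult` class, the `run` helper, and the 3-cycle latency are all made up for illustration and do not reflect the actual `std_mult_pipe` ports or timing. It also models only the valid signal travelling with the data, with no back-pressure.

```python
from collections import deque


class PipelinedMult:
    """Toy model of a multiplier that accepts new operands every cycle.

    The latency value is an illustrative assumption, not the real primitive's.
    """

    def __init__(self, latency=3):
        self.latency = latency
        # One (valid, product) slot per pipeline stage; all start empty.
        self.stages = deque([(False, 0)] * latency, maxlen=latency)

    def tick(self, in_valid=False, left=0, right=0):
        """Advance one clock cycle and return (out_valid, out)."""
        out_valid, out = self.stages[-1]
        # Shift the pipeline: the oldest slot falls off the right end.
        self.stages.appendleft((in_valid, left * right if in_valid else 0))
        return out_valid, out


def run(pairs, latency=3):
    """Issue one operand pair per cycle (II=1) and collect the products."""
    mult = PipelinedMult(latency)
    pending = list(pairs)
    results = []
    cycles = 0
    while len(results) < len(pairs):
        if pending:
            left, right = pending.pop(0)
            out_valid, out = mult.tick(True, left, right)
        else:
            out_valid, out = mult.tick()  # drain the pipeline
        if out_valid:
            results.append(out)
        cycles += 1
    return results, cycles


products, cycles = run([(1, 2), (3, 4), (5, 6), (7, 8)])
print(products, cycles)
```

In this model, N multiplications finish in N + latency cycles, whereas a user that has to wait for each result before issuing the next one pays the full latency for every multiplication.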
One idea for doing this is to use the primitives generated from Filament designs and expose two ports: one that allows access to the pipelined execution and another that registers the output. If a design doesn't use the latter, it'll get optimized into a purely pipelined operator.
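A tiny Python model of how I read the two-port idea, purely as a sketch. The names `TwoPortMult`, `out`, and `out_reg`, and the 3-cycle latency, are hypothetical and are not what Filament actually generates.

```python
class TwoPortMult:
    """Toy model with a pipelined port and a registered port (names assumed)."""

    def __init__(self, latency=3):
        self.stages = [None] * latency  # None marks an empty stage (a bubble)
        self.out = None      # pipelined port: holds a product for one cycle only
        self.out_reg = None  # registered port: keeps the most recent product

    def tick(self, operands=None):
        """Advance one clock cycle; operands is an optional (left, right) pair."""
        self.out = self.stages[-1]
        if self.out is not None:
            self.out_reg = self.out  # latch the result when one arrives
        product = None if operands is None else operands[0] * operands[1]
        self.stages = [product] + self.stages[:-1]


mult = TwoPortMult()
mult.tick((2, 3))
for _ in range(3):
    mult.tick()
print(mult.out, mult.out_reg)  # result is visible on both ports this cycle
mult.tick()
print(mult.out, mult.out_reg)  # only the registered port still holds it
```

A design that only reads `out` never depends on the extra register, which is the case where it could be optimized away.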
Once we wrap up #1725, we can start working on this. The new interface is flexible enough to allow for modules to be pipelined.