calyx
Idea to speed up the execution of Calyx designs through Verilator
We have already seen that manually splitting a Calyx file containing a `seq` of (for example) 20 invokes into 20 separate Calyx files with one invoke each can dramatically speed up the execution of large Calyx designs through Verilator.
Based on @rachitnigam's idea (we talked yesterday but I'm not positive I'm understanding things correctly), we can build a basic compiler that lowers a Calyx program in the following way:
1. We run the different Calyx components through Verilator individually, which lowers the Calyx-generated Verilog into C++. So we will have separate C++ modules that represent Calyx invokes.
2. We lower the control flow of the Calyx program into C++ code.
3. We somehow find a way to use the Verilator-generated C++ modules from step 1 in the C++ control-flow code generated in step 2 to simulate the Calyx program.
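To make the last step above more concrete, here is a rough sketch of what the generated C++ control code for a `seq` of two invokes might look like. It is only an illustration under assumptions: `StubComponent` is a hypothetical stand-in for a Verilator-generated class (real generated models expose their ports as public members and an `eval()` method), and `run_to_done` and `simulate_seq_example` are made-up helper names. The only thing it relies on from Calyx is the `go`/`done` interface of a component.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a Verilator-generated model. It raises `done`
// once `go` has been high for `latency` rising clock edges.
struct StubComponent {
    uint8_t clk = 0, go = 0, done = 0;

    explicit StubComponent(unsigned latency) : latency_(latency) {}

    void eval() {
        // Only react on a rising clock edge.
        if (clk && !prev_clk_) {
            if (go) {
                ++count_;
                done = (count_ >= latency_) ? 1 : 0;
            } else {
                count_ = 0;
                done = 0;
            }
        }
        prev_clk_ = clk;
    }

private:
    unsigned latency_;
    unsigned count_ = 0;
    uint8_t prev_clk_ = 0;
};

// Drive one component from `go` to `done`, returning the cycles it took.
// Works with any model type that has clk/go/done members and eval().
template <typename Model>
uint64_t run_to_done(Model& m, uint64_t max_cycles = 1000000) {
    uint64_t cycles = 0;
    m.go = 1;
    while (!m.done) {
        assert(cycles < max_cycles && "component never raised done");
        m.clk = 0; m.eval();
        m.clk = 1; m.eval();
        ++cycles;
    }
    // Deassert go and tick once so the component can reset itself.
    m.go = 0;
    m.clk = 0; m.eval();
    m.clk = 1; m.eval();
    m.clk = 0; m.eval();
    return cycles;
}

// A Calyx `seq { invoke a; invoke b; }` becomes plain sequential C++:
// only the component that is currently running gets evaluated.
uint64_t simulate_seq_example() {
    StubComponent a(3), b(5);
    uint64_t total = 0;
    total += run_to_done(a);
    total += run_to_done(b);
    return total;
}
```

Passing data between invokes (memories, arguments) is left out here; that is probably the hard part of the "somehow" in the last step.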
Yup, you understood the idea perfectly! This is something @EclecticGriffin and I have also discussed before.
> We have already seen that manually splitting a Calyx file containing a `seq` of (for example) 20 invokes into 20 separate Calyx files with one invoke each can dramatically speed up the execution of large Calyx designs through Verilator.
Why?
Because most of the circuit is not doing any useful computation most of the time; the generated C++ still evaluates all of it every cycle and throws the results away. If you remove that wasted computation entirely, the useful parts of the circuit simulate faster.
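A back-of-the-envelope way to see this, as a sketch and not a measurement: assume every component costs one unit of work per simulated cycle and that the invokes run one after another. The function names and the uniform-cost assumption are made up for illustration; real Verilator output will not behave this cleanly.

```cpp
#include <cstdint>
#include <vector>

// One big design: every cycle evaluates every component, even though only
// one invoke in the seq is active at a time.
uint64_t monolithic_evals(const std::vector<uint64_t>& latencies) {
    uint64_t total_cycles = 0;
    for (uint64_t l : latencies) total_cycles += l;
    return total_cycles * latencies.size();
}

// Split design: each component is evaluated only while it is running.
uint64_t split_evals(const std::vector<uint64_t>& latencies) {
    uint64_t total = 0;
    for (uint64_t l : latencies) total += l;
    return total;
}
```

Under these assumptions, 20 invokes of 100 cycles each cost 40000 component evaluations in the monolithic design versus 2000 when split, so the wasted work grows with the number of idle components.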