
Try out Halide to Calyx flow

Open rachitnigam opened this issue 2 years ago • 10 comments

@xerpi has been working on a Halide (to MLIR) to Calyx flow, and we should give it a try to get a sense of what we need to do to support it.

Some information on this (courtesy of @xerpi):

  • CIRCT branch: https://github.com/xerpi/circt/commits/dev/xerpi/scf-to-calyx-vector-types
  • Halide Backend changes: https://github.com/halide/Halide/pull/7587
  • Halide XCLbin runtime: https://github.com/halide/Halide/pull/7668

Other refs:

  • https://github.com/xerpi/Halide/blob/initial-mlir-codegen/src/CodeGen_MLIR.cpp#L343
  • https://github.com/xerpi/Halide/blob/initial-mlir-codegen/src/CodeGen_MLIR.cpp#L469

rachitnigam avatar Jul 05 '23 19:07 rachitnigam

Some instructions to get started:

  1. First, install the MLIR libraries: either compile LLVM with MLIR enabled (-DLLVM_ENABLE_PROJECTS="mlir"), or install it from your distro repositories (in Fedora it's the mlir-devel package).

  2. Build Halide: https://github.com/halide/Halide#building-halide-with-cmake

Check that MLIRConfig.cmake is found when running cmake. The configure output should include lines like these:

-- Using LLVMConfig.cmake in: /usr/lib64/cmake/llvm
-- Using ClangConfig.cmake in: /usr/lib64/cmake/clang
-- Using MLIRConfig.cmake in: /usr/lib64/cmake/mlir
...

In my case, I also had to pass -DTARGET_WEBASSEMBLY:BOOL=OFF to CMake; otherwise it complained about shared LLVM libs.
I also recommend passing -DHalide_ENABLE_EXCEPTIONS:BOOL=OFF to get the errors printed.

  3. Currently, MLIR code generation has to be triggered manually by calling Func::compile_to_mlir. So you have to edit Halide programs and replace the realize() call (which compiles and runs on the host) with a call to compile_to_mlir.

xerpi avatar Jul 07 '23 12:07 xerpi

Thanks for adding the information @xerpi!

rachitnigam avatar Jul 07 '23 13:07 rachitnigam

I think this is the dissertation that describes the flow: https://upcommons.upc.edu/bitstream/handle/2117/390390/176860.pdf?sequence=2

rachitnigam avatar Jul 08 '23 01:07 rachitnigam

I think this is the dissertation that describes the flow: https://upcommons.upc.edu/bitstream/handle/2117/390390/176860.pdf?sequence=2

Indeed, that was my thesis dissertation for the project. The "interesting" part starts in "Chapter 4 - Methodology".

Here's a summary of the flow:

High-level overview of the implemented flow from Halide down to execution on Xilinx FPGAs: image

Passes executed to convert from generic MLIR to CIRCT’s hardware dialects (after the MLIR to Calyx step, human-readable Calyx code can be emitted): image

Steps needed to export SystemVerilog targeting Xilinx devices from CIRCT with hardware dialects. First, the needed Xilinx-specific wrappers are added, and then the passes to convert to SystemVerilog are executed. A kernel.xml file needed by Vitis v++ is also generated: image

xerpi avatar Jul 09 '23 08:07 xerpi

Oh interesting! Did you ever use the native compiler to perform any optimizations to the design? I wonder how the resulting designs would differ.

rachitnigam avatar Jul 09 '23 13:07 rachitnigam

Oh interesting! Did you ever use the native compiler to perform any optimizations to the design? I wonder how the resulting designs would differ.

At the beginning of development, I did emit human-readable Calyx code and used the native Calyx compiler to check that the code produced was at least semantically correct. At that time, I still had to implement Halide's XRT runtime backend, so I didn't do any performance comparisons (or even visually check the differences in the emitted RTL code).

Soon after that, I added custom Calyx operations to support vector types and since then unfortunately Calyx code can't be emitted anymore.

xerpi avatar Jul 10 '23 02:07 xerpi

Ah, got it! What are the custom vector operations that you ended up adding to Calyx? Maybe we can figure out a way to support them natively.

rachitnigam avatar Jul 10 '23 12:07 rachitnigam

Ah, got it! What are the custom vector operations that you ended up adding to Calyx? Maybe we can figure out a way to support them natively.

To implement the Halide vector Broadcast (also called vector splat) operation, I added an equivalent node to Calyx: https://github.com/xerpi/circt/commit/53d5f7dcc115355797fc89b8c929706924ae943d#diff-257f12e7264e222a495e648498b969968a86ac197236df0433a65335da1509bf
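As a rough illustration of what a broadcast (splat) computes, here is a minimal Python sketch that replicates one scalar into every lane of a flat bit vector. The function name `splat_bits` and the choice to place lane 0 in the least significant bits are assumptions made for this example; they are not taken from the linked CIRCT commit.

```python
def splat_bits(value: int, lanes: int, lane_bits: int) -> int:
    """Replicate `value` into each of `lanes` lanes of `lane_bits` bits.

    Hypothetical helper: lane 0 is assumed to occupy the least
    significant bits of the returned integer.
    """
    if lanes <= 0 or lane_bits <= 0:
        raise ValueError("lanes and lane_bits must be positive")
    lane = value & ((1 << lane_bits) - 1)  # truncate to the lane width
    flat = 0
    for i in range(lanes):
        flat |= lane << (i * lane_bits)  # copy the scalar into lane i
    return flat


print(hex(splat_bits(0xAB, lanes=4, lane_bits=8)))  # 0xabababab
```

In hardware terms this is just fan-out of the same wires to every lane, which is presumably why a dedicated node is convenient.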

I also had to remove the constraint that ensures that the src and dst wires of a Calyx assign are of the same type, so that I could flatten vectors to a plain range of bits and unflatten them again (needed, for example, when reading from the memory bus into a vector, and vice versa): https://github.com/xerpi/circt/commit/53d5f7dcc115355797fc89b8c929706924ae943d#diff-129515b0cbde7eccbd6943c9b6f45d597ff0fcc25df9f45b4da60e804815e8c0

I also had to add support for sequential-read memories (extra "read-en" signal): https://github.com/llvm/circt/pull/4857
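To make the extra read-enable signal concrete, here is a toy behavioral model in Python of a memory whose read output is registered: the data for an address only shows up after a clock edge on which the read enable was asserted. The class name `SeqReadMem`, the `tick` method, and the write-over-read priority are all assumptions for this sketch; it does not reproduce the interface of the primitive added in the linked PR.

```python
class SeqReadMem:
    """Toy model of a sequential-read memory (hypothetical interface)."""

    def __init__(self, size: int, width: int) -> None:
        self.data = [0] * size
        self.mask = (1 << width) - 1
        self.read_data = 0  # registered output, holds its last value

    def tick(self, addr: int, read_en: bool = False,
             write_en: bool = False, write_data: int = 0) -> int:
        """Advance one clock cycle and return read_data after the edge."""
        if write_en:
            # Assumption: a write takes priority over a read in the same cycle.
            self.data[addr] = write_data & self.mask
        elif read_en:
            # The output register is only updated when read_en is asserted.
            self.read_data = self.data[addr]
        return self.read_data


mem = SeqReadMem(size=4, width=8)
mem.tick(1, write_en=True, write_data=0x2A)
print(mem.tick(1, read_en=True))  # 42
print(mem.tick(2))                # 42 (no read_en, output is held)
```

The contrast is with a combinational-read memory, where the output follows the address directly and no enable is needed.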

xerpi avatar Jul 11 '23 05:07 xerpi

Bringing back this discussion after almost a year :)

I'm working on vector support and have some related questions about vector operations.

To implement the Halide vector Broadcast (also called vector splat) operation, I added an equivalent node to Calyx: https://github.com/xerpi/circt/commit/53d5f7dcc115355797fc89b8c929706924ae943d#diff-257f12e7264e222a495e648498b969968a86ac197236df0433a65335da1509bf

@xerpi Did you get a chance to implement vector additions and vectors as return types, especially given that Calyx doesn't have vector registers?

jiahanxie353 avatar Apr 16 '24 21:04 jiahanxie353

@jiahanxie353 Hi! I didn't get a chance to implement it properly... IIRC, what I did was to implicitly cast a flat bit array of N bits to a vector with K lanes (N/K bits per lane) and vice versa. The code was a horrible hack, and I wish I had had more time to implement something cleaner. I think you can find it here: https://github.com/xerpi/circt/commits/dev/xerpi/scf-to-calyx-vector-types/
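For readers following along, the cast described above can be sketched in a few lines of Python: N flat bits are split into K lanes of N/K bits each, and the lanes are packed back into the same flat value. The names `unflatten` and `flatten`, and the convention that lane 0 sits in the least significant bits, are assumptions for this example and may not match what the branch actually does.

```python
def unflatten(flat: int, lanes: int, total_bits: int) -> list[int]:
    """Split `total_bits` flat bits into `lanes` equally sized lanes."""
    if lanes <= 0 or total_bits % lanes != 0:
        raise ValueError("total_bits must be a positive multiple of lanes")
    lane_bits = total_bits // lanes  # N / K bits per lane
    mask = (1 << lane_bits) - 1
    # Assumption: lane 0 is taken from the least significant bits.
    return [(flat >> (i * lane_bits)) & mask for i in range(lanes)]


def flatten(vec: list[int], lane_bits: int) -> int:
    """Pack lanes back into a flat integer (inverse of unflatten)."""
    mask = (1 << lane_bits) - 1
    flat = 0
    for i, lane in enumerate(vec):
        flat |= (lane & mask) << (i * lane_bits)
    return flat


lanes = unflatten(0x04030201, lanes=4, total_bits=32)
print(lanes)                     # [1, 2, 3, 4]
print(hex(flatten(lanes, 8)))    # 0x4030201
```

Since both directions are pure rewiring, no logic is needed in hardware; the hard part is the type system accepting the assignment.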

xerpi avatar Apr 16 '24 23:04 xerpi