[CIRCT] Lowering high level MLIR programs to Calyx.
This documents progress on lowering a high-level program to native* Calyx and then running it through Cider. It should serve as a useful reference for those not entirely familiar with the MLIR workflow or the Calyx workflow. The high-level conversion path is illustrated by the following infographic [1]:

PyTorch -> Linalg -> Affine -> StaticLogic (CIRCT) -> Calyx (CIRCT) -> Calyx (Native).
* "Native" Calyx is the intermediate representation as implemented in this repository [2]. The other representation is the Calyx dialect, which follows MLIR standards. Emitters exist to translate between the native form and the dialect.
Requirements:
- Install Torch-MLIR snapshot
- Install CIRCT (which in turn requires MLIR)
- Install Calyx, fud, Cider
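A quick way to confirm the requirements above are met is to check that each executable used in the lowering commands is on PATH. This is a minimal sketch: the helper name `check_tools` is hypothetical, and the tool list is simply the set of command names that appear in this issue (Cider is reached through fud here, so it is not checked separately).

```shell
#!/usr/bin/env bash
# Report which of the given executables are missing from PATH.
# Prints one "missing: <name>" line per absent tool and returns
# non-zero if at least one is absent. (check_tools is a hypothetical helper.)
check_tools() {
  local rc=0
  local tool
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool"
      rc=1
    fi
  done
  return "$rc"
}

# Tool names taken from the commands in this issue.
check_tools torch-mlir-opt mlir-opt circt-opt circt-translate fud || true
```

If nothing is printed, all five executables were found.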
Lowering
- PyTorch -> Linalg
torch-mlir-opt <%s -convert-torch-to-linalg
- Linalg -> Affine
# TODO(cgyurgyik): Verify with PyTorch community this is still the bufferization method used.
mlir-opt %s --linalg-comprehensive-module-bufferize='allow-return-memref use-alloca' --convert-linalg-to-affine-loops
- Affine -> Static Logic
circt-opt %s -convert-affine-to-staticlogic
- StaticLogic -> Calyx (CIRCT)
circt-opt %s --lower-scf-to-calyx='cider-source-location-metadata' -canonicalize
- Calyx (CIRCT) -> Calyx (Native)
circt-translate %s --export-calyx
- Cider
fud e --to debugger %s
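The steps above can be chained into one script, with each stage writing an intermediate file that feeds the next. This is a sketch under stated assumptions, not a tested flow: the flags are copied verbatim from the list above (and may be out of date, as noted later in this thread), `%s` is treated as a placeholder for the input file and replaced by a positional file argument, and the function names, intermediate file names, the `.futil` extension, and the `DRY_RUN` switch are all illustrative choices.

```shell
#!/usr/bin/env bash
# Sketch only: chains the commands listed above, one file per stage.
# File names and the DRY_RUN switch are assumptions, not tool features.

# Print a stage's command; execute it (stdout redirected to the file
# named by the first argument) only when DRY_RUN=0.
stage() {
  local out="$1"
  shift
  echo "+ $* > $out"
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@" > "$out" || return 1
  fi
}

# Lower one Torch-dialect .mlir file down to native Calyx.
lower_to_calyx() {
  local src="$1"
  local base="${src%.mlir}"
  stage "$base.linalg.mlir" torch-mlir-opt "$src" -convert-torch-to-linalg &&
  stage "$base.affine.mlir" mlir-opt "$base.linalg.mlir" \
    "--linalg-comprehensive-module-bufferize=allow-return-memref use-alloca" \
    --convert-linalg-to-affine-loops &&
  stage "$base.staticlogic.mlir" circt-opt "$base.affine.mlir" \
    -convert-affine-to-staticlogic &&
  stage "$base.calyx.mlir" circt-opt "$base.staticlogic.mlir" \
    "--lower-scf-to-calyx=cider-source-location-metadata" -canonicalize &&
  stage "$base.futil" circt-translate "$base.calyx.mlir" --export-calyx
}

# Dry run by default: prints the five commands without executing them.
lower_to_calyx "${1:-dot.mlir}"
```

Setting `DRY_RUN=0` executes the stages and stops at the first failure. The resulting native Calyx file can then be handed to the Cider step, `fud e --to debugger`.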
> TODO(cgyurgyik): Verify with PyTorch community this is still the bufferization method used.
FYI this is definitely not the bufferization method they use, this is a newer method I wanted to try out in hlstool. The hlstool command you've pasted here is actually out of date already by the looks of it. If you are running into bufferization issues, I can try to help use the method torch-mlir uses, or adopt the latest and greatest.
I must have misunderstood our chat offline. At this point, I'm just trying to lower something that the StaticLogic will nicely digest.
Hey @cgyurgyik, what is this issue currently tracking? I don't see any specific tasks on it so just wondering if everything has been done.
So I've been working on lowering Affine -> Calyx native.
First, a couple bugs with a hacky fix:
- https://github.com/llvm/circt/issues/3111
- https://github.com/llvm/circt/issues/3112
Trying to run anything a bit more complex than the tests provided in the CIRCT suite has led to compiler crashes and, more recently, to bits being removed by std_slice. The interpreter did not catch the latter because slicing off bits is sometimes expected behavior. Instead of introducing more brittle fixes, I've decided the next best step for longevity is to finally separate the two passes (the SCF and StaticLogic conversions): https://github.com/llvm/circt/issues/2988 (tracker). @mortbopet has also suggested the following clean-up for the StaticLogic side: https://github.com/llvm/circt/issues/3113. However, we still need to finish the separation clean-up (WIP).
Part of this issue is placing all the necessary steps in one location. Yes, StaticLogic -> Calyx native is possible. It just isn't very robust currently.
> Part of this issue is placing all the necessary steps in one location.
For the record, this flow does exist in one place, wrapped up by a Python script here: https://github.com/circt-hls/circt-hls/tree/main/tools/hlstool.
I'd be curious to take a look if you have more examples of failures, the pass should emit an error instead of crashing.
This one seems to be an assumption I need to relax: https://github.com/llvm/circt/issues/3127.
The big TODO I know about for StaticLogic -> Calyx is nested loops and there are at least two ways to skin this cat: https://github.com/llvm/circt/issues/2320 https://github.com/llvm/circt/issues/2659
> Part of this issue is placing all the necessary steps in one location.
> For the record, this flow does exist in one place, wrapped up by a Python script here: https://github.com/circt-hls/circt-hls/tree/main/tools/hlstool.
I haven't used hlstool, but it definitely looks like something good to have in the workflow. My goal here was to provide minimal commands.
> I'd be curious to take a look if you have more examples of failures, the pass should emit an error instead of crashing.
Yeah I can bring up more examples, but honestly it wasn't hard for me to cause the compiler to crash / produce semantically incorrect code (e.g., slicing bits that shouldn't be sliced) - this is without any loop nesting. After the initial hacky fix, it seemed like the best next step was to separate SCF and StaticLogic code as an intermediary step.
Then, we can reproduce these bugs and hopefully write better fixes for them that don't require reasoning about two conversions. Curiously, do you have examples of bigger programs you've successfully lowered to Calyx native with desired behavior?
> Curiously, do you have examples of bigger programs you've successfully lowered to Calyx native with desired behavior?
Nothing more complex than a single loop nest, like what is checked into CIRCT's tests.
> Yeah I can bring up more examples, but honestly it wasn't hard for me to cause the compiler to crash / produce semantically incorrect code (e.g., slicing bits that shouldn't be sliced) - this is without any loop nesting.
Please file issues on CIRCT. No one other than myself has tested this flow, and I've only tested the simplest meaningful example I could write: a dot product. This is very experimental work in progress.
@cgyurgyik are there action items for this tracker that we should be working on? If not, let's wrap up the issue.