feat: use MLIR resources to capture large constants in the IR
https://github.com/google/heir/pull/1501 adds short-term support for serializing large constants at code emission time for the openfhe backend. To support this in a backend-agnostic way, and throughout the compilation process rather than only at emission, it would be a good idea to use MLIR dialect resource blobs to handle the large constants that slow compilation times.
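For context, the problem case is an IR file carrying model weights inline as textual dense elements attributes, which every parse/print round trip has to process in full. A hypothetical example (shape and hex payload made up, and elided):

```mlir
// A large weight tensor stored inline as a textual dense attribute.
// Parsing and re-printing this blob on every pipeline invocation is a
// major contributor to slow compile times for large models.
%weights = arith.constant dense<"0x..."> : tensor<1024x1024xf32>
```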
@ZenithalHourlyRate laid out a good way to support this earlier in the pipeline (rather than only at one backend's code emission):
- [ ] A pass to transform `arith.constant dense<blob_string>` into `arith.constant dense_resource<resource_key>` plus a trailing resource section in the file. Optionally, split the IR and the resources into two files, since MLIR allows a `dense_resource` without the referenced value present in the IR.
- [ ] `linalg-to-tensor-ext` is provided with the resource file to operate on. Some passes may also fold or canonicalize tensor operations (e.g. loop unrolling?).
- [ ] Code emission to a particular backend from the split resource file should be done by other tools, say `mlirResourceToCereal` / `mlirResourceToMlirByteCode`.
- [ ] We can support importing constants in an `OnnxToMlirResource`-ish way.
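As a rough sketch of the first item (the resource key `large_const_0` and the tiny tensor here are made up for illustration), the pass would rewrite an inline constant into a resource handle plus a trailing resource section:

```mlir
// Before: element values are inlined in the attribute.
%0 = arith.constant dense<[1, 2, 3, 4]> : tensor<4xi32>

// After: the op references a named blob stored out of line.
%0 = arith.constant dense_resource<large_const_0> : tensor<4xi32>

// Trailing resource section. With the optional split, this section could
// live in a separate file from the IR above.
{-#
  dialect_resources: {
    builtin: {
      // If I'm reading the blob encoding right, the leading 4-byte word
      // encodes the alignment, followed by raw little-endian element data.
      large_const_0: "0x0400000001000000020000000300000004000000"
    }
  }
#-}
```

This is the same `dialect_resources` mechanism torch-mlir already emits, so the importer side may come for free.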
I wanted to revisit this since torch-mlir actually outputs dense resources for the large constants in the IR.
For large models, we may want to split the model weight processing into a separate function. I'm still not sure what format we want to use to serialize the weights to disk, though. For the torch C++ examples, we used the torch library, and the user would load serialized torch tensors.