altius
Small ONNX inference runtime written in Rust
- [x] Initial support (model: https://huggingface.co/timm/fastvit_s12.apple_in1k)
- [ ] Improve performance
- Create a `struct GraphModifier` (or something similar) to make graph-modification code easy to implement.
- [ ] MatMul+Add fusion (MatMul+Add → Gemm); a rough sketch follows this list
- [x] Reshape+Transpose fusion
- [ ] **Automatic fusion for element-wise operations**
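
For the `GraphModifier` idea and the MatMul+Add → Gemm item above, here is a minimal, self-contained sketch of how such a pass could look. The `Op`, `Node`, and `Graph` types and the `fuse_matmul_add` method are illustrative stand-ins, not altius' actual IR or API; real code would also have to check shapes, broadcasting, and graph outputs before rewriting.

```rust
// Self-contained sketch: the `Op`/`Node`/`Graph` types are simplified
// stand-ins for an IR, not altius' real data structures.

#[derive(Clone, Debug, PartialEq)]
enum Op {
    MatMul,
    Add,
    Gemm,
    Other(String),
}

#[derive(Clone, Debug)]
struct Node {
    op: Op,
    inputs: Vec<usize>,  // ids of the values this node consumes
    outputs: Vec<usize>, // ids of the values this node produces
}

#[derive(Debug)]
struct Graph {
    nodes: Vec<Node>, // assumed to be in topological order
}

/// Wraps a graph and offers rewrite helpers so that fusion passes stay short.
struct GraphModifier<'a> {
    graph: &'a mut Graph,
}

impl<'a> GraphModifier<'a> {
    fn new(graph: &'a mut Graph) -> Self {
        Self { graph }
    }

    /// Fuse `MatMul -> Add` chains into a single `Gemm` node.
    /// Shape/broadcast checks required for a legal Gemm are omitted for brevity,
    /// and the MatMul output is assumed not to be a graph output.
    fn fuse_matmul_add(&mut self) -> usize {
        let mut fused = 0;
        let mut i = 0;
        while i < self.graph.nodes.len() {
            // Look for a MatMul with a single output value.
            let matmul_out = match &self.graph.nodes[i] {
                Node { op: Op::MatMul, outputs, .. } if outputs.len() == 1 => outputs[0],
                _ => {
                    i += 1;
                    continue;
                }
            };
            // The MatMul output must feed exactly one consumer, and it must be an Add.
            let consumers: Vec<usize> = self
                .graph
                .nodes
                .iter()
                .enumerate()
                .filter(|(_, n)| n.inputs.contains(&matmul_out))
                .map(|(j, _)| j)
                .collect();
            if consumers.len() != 1 || self.graph.nodes[consumers[0]].op != Op::Add {
                i += 1;
                continue;
            }
            let j = consumers[0];
            debug_assert!(j > i, "topological order: the Add comes after the MatMul");

            // Gemm inputs are the MatMul's A and B plus the Add's other (bias) input.
            let bias = self.graph.nodes[j]
                .inputs
                .iter()
                .copied()
                .find(|&v| v != matmul_out)
                .expect("Add must have a second input");
            let mut gemm_inputs = self.graph.nodes[i].inputs.clone();
            gemm_inputs.push(bias);

            // Replace the MatMul with the fused Gemm and drop the Add.
            self.graph.nodes[i] = Node {
                op: Op::Gemm,
                inputs: gemm_inputs,
                outputs: self.graph.nodes[j].outputs.clone(),
            };
            self.graph.nodes.remove(j);
            fused += 1;
        }
        fused
    }
}
```

With a helper like this in place, the Reshape+Transpose and element-wise fusions would just be further methods (or small pass structs) built on the same wrapper.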
Supported kernels:
- [x] LeakyRelu
- [x] Resize
- [x] Concat
- [x] Transpose
- [x] Squeeze
- [x] Div
- [ ] ReduceMin
- [ ] Round
- [x] ...
Backend:
- [ ] Interpreter
- [ ] CPU
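
For the backend list above, a hypothetical sketch of the kind of abstraction an interpreter and a CPU backend could share. The `Backend` trait and the `Model`/`Tensor` stand-ins are illustrative only, not altius' actual types.

```rust
// Hypothetical backend abstraction; altius' real traits and types may differ.

/// Stand-in for a loaded ONNX model graph.
struct Model;

/// Stand-in for a tensor: a flat f32 buffer (shapes omitted for brevity).
type Tensor = Vec<f32>;

trait Backend {
    /// One-time preparation: planning, memory allocation, code generation, ...
    fn compile(&mut self, model: &Model) -> Result<(), String>;
    /// Execute the prepared model on the given inputs.
    fn run(&mut self, inputs: &[Tensor]) -> Result<Vec<Tensor>, String>;
}

/// Reference backend that walks the graph node by node at run time.
struct Interpreter;

impl Backend for Interpreter {
    fn compile(&mut self, _model: &Model) -> Result<(), String> {
        Ok(()) // nothing to precompute; the interpreter reads the graph directly
    }
    fn run(&mut self, inputs: &[Tensor]) -> Result<Vec<Tensor>, String> {
        // Placeholder: a real interpreter dispatches each node to its kernel here.
        Ok(inputs.to_vec())
    }
}
```

A CPU backend would implement the same trait but do its heavy lifting in `compile` (e.g. emitting specialized native code per node) so that `run` only executes the prepared plan.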
- [ ] https://github.com/maekawatoshiki/altius/blob/efd8fd3949cb200637f916186c525c454d37e633/crates/altius-py/tests/test_ops_elemwise.py#L165
- [ ] Fuse nodes into [`com.microsoft.MultiHeadAttention`](https://github.com/microsoft/onnxruntime/blob/main/docs/ContribOperators.md#commicrosoftmultiheadattention)
  - Python script here: https://github.com/maekawatoshiki/altius/blob/0dcb1e666342a169919d80999f3c642db270aded/crates/altius-py/fuse_attn.py
- [ ] Add a kernel for it (a sketch of the underlying math follows)
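
Regarding the missing kernel, below is an unoptimized, single-head sketch of the core computation such a kernel implements: softmax(QKᵀ/√d)·V over row-major `[seq, d]` buffers. The real `com.microsoft.MultiHeadAttention` contract (multiple heads, packed QKV, attention masks, past/present key-value state) is considerably richer, so treat this as the math only, not the operator.

```rust
/// Naive scaled dot-product attention for a single head.
/// `q`, `k`, `v` are row-major [seq, d] buffers; returns a [seq, d] output.
fn attention(q: &[f32], k: &[f32], v: &[f32], seq: usize, d: usize) -> Vec<f32> {
    assert_eq!(q.len(), seq * d);
    assert_eq!(k.len(), seq * d);
    assert_eq!(v.len(), seq * d);
    let scale = 1.0 / (d as f32).sqrt();
    let mut out = vec![0.0f32; seq * d];
    let mut scores = vec![0.0f32; seq];

    for i in 0..seq {
        // scores[j] = (q_i . k_j) / sqrt(d)
        for j in 0..seq {
            let mut dot = 0.0f32;
            for t in 0..d {
                dot += q[i * d + t] * k[j * d + t];
            }
            scores[j] = dot * scale;
        }
        // Numerically stable softmax over the scores of row i.
        let max = scores.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
        let mut sum = 0.0f32;
        for s in scores.iter_mut() {
            *s = (*s - max).exp();
            sum += *s;
        }
        for s in scores.iter_mut() {
            *s /= sum;
        }
        // out_i = sum_j scores[j] * v_j
        for j in 0..seq {
            let w = scores[j];
            for t in 0..d {
                out[i * d + t] += w * v[j * d + t];
            }
        }
    }
    out
}
```

A real kernel would run this per head over a `[batch, num_heads, seq, head_dim]` layout and fold in the mask/bias terms defined by the contrib operator.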