
ONNX models with import or runtime issues

Open antimora opened this issue 4 months ago • 37 comments

Tracking: Models That Fail to Import or Run in Burn

This issue tracks ONNX models that currently cannot be converted from ONNX, fail to build as Rust code, have runtime problems, or produce inaccurate outputs when using burn-import. If you encounter a model that fails to import or execute, please comment below or submit a new issue and reference this tracker.

Checklist of Models with Known Issues (working if checked)

Natural Language Processing (NLP)

  • [x] ALBERT/BERT models (#1811)
  • [x] all-MiniLM-L6-v2 (#600)
  • [x] ModernBERT-base (#3130)
  • [ ] IBM Granite 4.0 Tiny Preview (models#71)

Multimodal (Vision-Language)

Object Detection

Depth Estimation

Generative Models

Audio/Speech

Computer Vision - General

Optical Flow & Pose Estimation

How to Use

  • Add a comment if you find a new failing model, or if your model is fixed by a PR.
  • Check off models as they become supported or fixed.
  • Reference this issue when creating new ONNX import failure reports.

For operator-level or feature gaps, please also check:

antimora avatar Jul 23 '25 01:07 antimora

Quick update: I am able to convert yolo11x_opset16.onnx using this WIP PR: #3381

antimora avatar Jul 24 '25 03:07 antimora

CLIP ViT-B-32 is buildable with https://github.com/tracel-ai/burn/pull/3560 (still under review)

antimora avatar Aug 16 '25 15:08 antimora

I'll be working on a test harness to test various large models quickly.

antimora avatar Aug 16 '25 15:08 antimora

RTMW3D-x is buildable with https://github.com/tracel-ai/burn/pull/3564 (still under review)

antimora avatar Aug 17 '25 05:08 antimora

I am able to import https://huggingface.co/Xenova/albert-large-v2/resolve/main/onnx/model.onnx but the Rust code has type errors.

n1ght-hunter avatar Aug 18 '25 06:08 n1ght-hunter

In Rust, it shows this error:

  === Tensor Operation Error ===
  Operation: 'Reshape'
  Reason:
    1. The given shape doesn't have the same number of elements as the current tensor. Current shape: [1035], target shape: [1, 1034].

The program in Python:

def __init__(self, input_dim=1035):
    super(Net, self).__init__()
    self.attention = ChannelAttention(input_dim)
    self.fc1 = nn.Linear(input_dim, 512)
    self.fc2 = nn.Linear(512, 256)
    self.fc3 = nn.Linear(256, 128)
    self.fc4 = nn.Linear(128, 2)
    self.relu = nn.ReLU()

torch.onnx.export(model, 
                dummy_input, 
                'd:/archicad/model_data/f_best_2.onnx',
                export_params=True, 
                opset_version=12, 
                do_constant_folding=True,
                input_names=['input'], 
                output_names=['output'],
                dynamic_axes={})

I have used uv run --script https://raw.githubusercontent.com/tracel-ai/burn/refs/heads/main/crates/burn-import/onnx_opset_upgrade.py to upgrade the model to opset version 16.
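For readers hitting the same Reshape error: the message means the element counts differ (1035 vs. 1 × 1034 = 1034). A minimal sketch of that check follows. The helper names `num_elements` and `check_reshape` are made up for illustration and are not part of Burn or burn-import. Where the 1034 comes from in this particular graph is not known from the report.

```rust
/// Number of elements implied by a shape (product of its dimensions).
/// Hypothetical helper, not a Burn API.
fn num_elements(shape: &[usize]) -> usize {
    shape.iter().product()
}

/// Returns Ok(()) when `target` holds exactly as many elements as `current`,
/// which is the precondition a reshape has to satisfy.
fn check_reshape(current: &[usize], target: &[usize]) -> Result<(), String> {
    let have = num_elements(current);
    let want = num_elements(target);
    if have == want {
        Ok(())
    } else {
        Err(format!(
            "cannot reshape {:?} ({} elements) into {:?} ({} elements)",
            current, have, target, want
        ))
    }
}
```

Calling `check_reshape(&[1035], &[1, 1034])` returns an `Err`, while `check_reshape(&[1035], &[1, 1035])` succeeds, so a target shape derived from the real input length would pass.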

coolstudio1678 avatar Aug 18 '25 12:08 coolstudio1678

n1ght-hunter wrote: "i am able to import https://huggingface.co/Xenova/albert-large-v2/resolve/main/onnx/model.onnx but rust code has type errors"

I have 3 outstanding ONNX related PRs (#3563, #3564, #3550) with fixes. Most likely it's caused by this: #3564.

Hopefully @laggui will have some time to review ;-)

antimora avatar Aug 18 '25 14:08 antimora

Resnet is buildable:

Image

antimora avatar Aug 18 '25 23:08 antimora

Can we add Kokoro to this? It is a very popular TTS model. It currently has an issue with expand1 - rank, which is an already known issue. It also uses ALBERT internally, so it will probably require the ALBERT model to be working first. Model: https://github.com/thewh1teagle/kokoro-onnx/releases/download/model-files-v1.0/kokoro-v1.0.onnx There is also the bin file for voices: https://github.com/thewh1teagle/kokoro-onnx/releases/download/model-files-v1.0/voices-v1.0.bin

n1ght-hunter avatar Aug 19 '25 03:08 n1ght-hunter

Even though the Yolo11X ONNX file can be converted into Rust code, the generated Rust code currently can't be built because it does not handle broadcasting (see why automatic full broadcasting is lacking in Burn: https://github.com/tracel-ai/burn/issues/1499). I have fixed this broadcasting issue here: https://github.com/tracel-ai/burn/pull/3589
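For context, ONNX element-wise ops use NumPy-style broadcasting: shapes are aligned from the trailing dimension, missing leading dimensions count as 1, and a dimension of size 1 stretches to match the other operand. The sketch below only computes the resulting shape. `broadcast_shape` is a hypothetical helper for illustration, not the code in the PR above and not a Burn API.

```rust
/// Computes the NumPy/ONNX-style broadcast shape of two shapes.
/// Returns None when the shapes are not broadcast-compatible.
/// Hypothetical helper for illustration only.
fn broadcast_shape(a: &[usize], b: &[usize]) -> Option<Vec<usize>> {
    let rank = a.len().max(b.len());
    let mut out = Vec::with_capacity(rank);
    for i in 0..rank {
        // Align from the trailing dimension; missing leading dims count as 1.
        let da = if i < rank - a.len() { 1 } else { a[i - (rank - a.len())] };
        let db = if i < rank - b.len() { 1 } else { b[i - (rank - b.len())] };
        match (da, db) {
            (x, y) if x == y => out.push(x),
            (1, y) => out.push(y),
            (x, 1) => out.push(x),
            _ => return None,
        }
    }
    Some(out)
}
```

For example, `broadcast_shape(&[8, 1, 6, 1], &[7, 1, 5])` gives `Some(vec![8, 7, 6, 5])`, while `broadcast_shape(&[2, 3], &[4, 3])` gives `None`. Because Burn encodes tensor rank in the type, generated code presumably has to make the rank alignment explicit rather than rely on it happening implicitly.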

Afterwards, there is another runtime issue with slice that I need to investigate.

antimora avatar Aug 21 '25 01:08 antimora

https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/blob/main/vae_decoder/model.onnx is failing to import due to a scalar input in ConstantOfShape:

  ERROR burn_import::logger: PANIC => panicked at C:\Users\Danilo\.cargo\registry\src\index.crates.io-1949cf8c6b5b557f\onnx-ir-0.18.0\src\node\constant_of_shape.rs:34:18:
  ConstantOfShape node must have a Tensor with a non-empty static shape value

Attribute in Netron:

tensor: float32[1]
[
    0
]

notdanilo avatar Aug 21 '25 12:08 notdanilo

I have added a harness to test models: https://github.com/tracel-ai/burn/tree/main/crates/burn-import/model-checks

YOLO11x is passing for the tch and ndarray backends, but it is currently failing on metal due to a bug in metal: https://github.com/tracel-ai/burn/issues/3600

I am working on clip-vit-b-32-text next. Locally it's passing (with 5-6 fixes in burn-import related to broadcasting and other issues). I have identified one bug in ndarray backend related to int lower open. I will submit a PR shortly and a bug report for ndarray (if I can't fix it).

antimora avatar Aug 25 '25 15:08 antimora

I recently had issues with Apple Depth Pro. Could it be added? It has a script for weights.

Another one of interest to me is LVSM: A Large View Synthesis Model with Minimal 3D Inductive Bias, which also has a list of checkpoints and also has issues importing in Burn (on a recent main).

torsteingrindvik avatar Aug 25 '25 18:08 torsteingrindvik

https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/blob/main/vae_encoder/model.onnx

Failing to load with

  Slice: steps other than 1 are not supported

step is -1

notdanilo avatar Aug 26 '25 22:08 notdanilo

With the latest branch, I have the same issue with yolor. Looking forward to fully utilizing burn-import!

extraymond avatar Aug 27 '25 06:08 extraymond

The slice limitation currently comes from Burn's tensor.slice(...) which doesn't accept step != 1 for ranges. We should improve this.

laggui avatar Aug 27 '25 11:08 laggui

I will check what exactly a step of -1 means: whether it reverses the slice or simply changes the relative position/orientation of the indices.
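For reference, the ONNX Slice operator follows NumPy slicing: a negative step walks the axis backward, so a step of -1 reverses the selected range. Below is a rough 1-D sketch of those semantics. `slice_1d` is a hypothetical helper written for this comment, not Burn's tensor.slice or the burn-import implementation.

```rust
/// NumPy/ONNX-style 1-D slice with a possibly negative step.
/// Hypothetical helper for illustration only.
fn slice_1d<T: Clone>(data: &[T], start: i64, end: i64, step: i64) -> Vec<T> {
    assert!(step != 0, "step must be non-zero");
    let n = data.len() as i64;
    // Negative indices count from the end of the axis.
    let normalize = |i: i64| if i < 0 { i + n } else { i };
    // Clamp to [0, n] when walking forward and to [-1, n - 1] when walking backward.
    let (lo, hi) = if step > 0 { (0, n) } else { (-1, n - 1) };
    let mut i = normalize(start).clamp(lo, hi);
    let stop = normalize(end).clamp(lo, hi);
    let mut out = Vec::new();
    while (step > 0 && i < stop) || (step < 0 && i > stop) {
        out.push(data[i as usize].clone());
        // Stop cleanly if a very large step would overflow the index.
        match i.checked_add(step) {
            Some(next) => i = next,
            None => break,
        }
    }
    out
}
```

With `data = [0, 1, 2, 3, 4]`, `slice_1d(&data, -1, i64::MIN, -1)` yields `[4, 3, 2, 1, 0]` (a full reversal, using a very negative end as exporters commonly do), and `slice_1d(&data, 3, 0, -2)` yields `[3, 1]`.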

notdanilo avatar Aug 27 '25 12:08 notdanilo

CLIP ViT-B/32 text model ONNX:

  1. Convertible to Rust
  2. Buildable
  3. Runnable
  4. Accurate compared with ONNX Runtime outputs
Image

The PR is under review: https://github.com/tracel-ai/burn/pull/3623

antimora avatar Aug 29 '25 00:08 antimora

CLIP ViT-B-32 fixes are merged. tch, metal and ndarray backends work.

To test:

cd crates/burn-import/model-checks/clip-vit-b-32-text
./get_model.py  # or: uv run ./get_model.py, or: python ./get_model.py

cargo run --release  # tch (default)

# ndarray
cargo run --release --no-default-features --features ndarray

# metal
cargo run --release --no-default-features --features metal

antimora avatar Sep 02 '25 17:09 antimora

I can confirm Retinaface works fine now too

Arcface can be converted, but panics at runtime with the metal backend, probably some of the same issues as in #3635 https://huggingface.co/FoivosPar/Arc2Face/blob/main/arcface.onnx

Yolo11x with static shape works, but with dynamic shape it fails to import (Python: from ultralytics import YOLO; YOLO("yolo11x.pt").export(format="onnx", dynamic=True))

AdrianEddy avatar Sep 04 '25 02:09 AdrianEddy

Thanks for reporting

antimora avatar Sep 04 '25 02:09 antimora

clip-ViT-B-32 Vision can be converted, but the converted code fails to compile due to incompatible tensor dims

error[E0308]: mismatched types
   --> clip_vision.rs:835:42
    |
835 |         let add1_out1 = concat3_out1.add(gather2_out1);
    |                                      --- ^^^^^^^^^^^^ expected `3`, found `2`
    |                                      |
    |                                      arguments to this method are incorrect
    |
    = note: expected struct `burn::tensor::Tensor<_, 3>`
               found struct `burn::tensor::Tensor<_, 2>`

error[E0308]: mismatched types
    --> clip_vision.rs:3259:9
     |
698  |     pub fn forward(&self, input1: Tensor<B, 4>) -> Tensor<B, 2> {
     |                                                    ------------ expected `burn::tensor::Tensor<B, 2>` because of return type
...
3259 |         div27_out1
     |         ^^^^^^^^^^ expected `2`, found `3`
     |
     = note: expected struct `burn::tensor::Tensor<_, 2>`
                found struct `burn::tensor::Tensor<_, 3>`

AdrianEddy avatar Sep 05 '25 23:09 AdrianEddy

Fixed: https://github.com/tracel-ai/burn/pull/3673

The issue was in Gather operator.

antimora avatar Sep 06 '25 23:09 antimora

@AdrianEddy #3673 is merged. It also addresses your concern regarding: indices = Tensor::<B, 1, _>::from_data

In the PR fix (#3673), the indices are now loaded from the weights file. Constant indices are preserved and not converted to static values unless the indices are used for a Shape input gather. This way there is no copying back and forth.

antimora avatar Sep 08 '25 15:09 antimora

I can confirm these models run as expected for me now:

  • CLIP ViT-B-32 (Text)
  • CLIP ViT-B-32 (Vision)
  • RetinaFace-resnet50
  • Arcface
  • Yolo11m (static shape)

I'm super happy with this and grateful for all your hard work, now I can replace ort with Burn entirely in my app!

Fun fact: Burn is 8x faster on macOS than onnxruntime with these models

AdrianEddy avatar Sep 11 '25 13:09 AdrianEddy

With PR #3736, the facenet model runs as expected.

A2va avatar Sep 17 '25 07:09 A2va

Yolov8n works https://github.com/tracel-ai/burn/pull/3750

antimora avatar Sep 21 '25 02:09 antimora

The YOLOv10 model requires a scalar TopK input and the Mod op.

antimora avatar Sep 21 '25 20:09 antimora

albert/albert-base-v2 model works with this PR fix: https://github.com/tracel-ai/burn/pull/3810

With the tch backend on an M3 Mac (params: 89,650,188):

========================================
ALBERT Base v2 Burn Model Test
========================================

Initializing ALBERT Base v2 model...
  Model initialized in 86.22ms

Saving model structure to artifacts/albert-base-v2_model.txt...
  Model structure saved

Loading test data from artifacts/albert-base-v2_test_data.pt...
  Data loaded in 1.12ms
  Loaded input tensors:
    input_ids shape: [1, 128]
    attention_mask shape: [1, 128]
    token_type_ids shape: [1, 128]
  Loaded reference outputs:
    last_hidden_state shape: [1, 128, 768]
    pooler_output shape: [1, 768]

Running model inference with test input...
  Inference completed in 33.69ms

  Model output shapes:
    output 0 (last_hidden_state): [1, 128, 768]
    output 1 (pooler_output): [1, 768]

Comparing model outputs with reference data...
  Checking last_hidden_state...
    ✓ last_hidden_state matches reference data within tolerance (1e-4)!
  Checking pooler_output...
    ✓ pooler_output matches reference data within tolerance (1e-4)!

========================================
Model test completed!
========================================

With the ndarray backend:

========================================
ALBERT Base v2 Burn Model Test
========================================

Initializing ALBERT Base v2 model...
  Model initialized in 59.57ms

Saving model structure to artifacts/albert-base-v2_model.txt...
  Model structure saved

Loading test data from artifacts/albert-base-v2_test_data.pt...
  Data loaded in 502.13µs
  Loaded input tensors:
    input_ids shape: [1, 128]
    attention_mask shape: [1, 128]
    token_type_ids shape: [1, 128]
  Loaded reference outputs:
    last_hidden_state shape: [1, 128, 768]
    pooler_output shape: [1, 768]

Running model inference with test input...
  Inference completed in 155.25ms

  Model output shapes:
    output 0 (last_hidden_state): [1, 128, 768]
    output 1 (pooler_output): [1, 768]

Comparing model outputs with reference data...
  Checking last_hidden_state...
    ✓ last_hidden_state matches reference data within tolerance (1e-4)!
  Checking pooler_output...
    ✓ pooler_output matches reference data within tolerance (1e-4)!

========================================
Model test completed!
========================================
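The "matches reference data within tolerance (1e-4)" lines can be read as an element-wise comparison against the reference outputs. A minimal sketch, assuming an absolute tolerance (the harness's exact comparison is not shown in this log, and `within_tolerance` is a made-up name, not a harness function):

```rust
/// Returns true when both slices have the same length and every pair of
/// elements differs by at most `tol` in absolute value.
/// A NaN on either side makes the check fail.
/// Hypothetical helper; the real harness may compare differently.
fn within_tolerance(actual: &[f32], expected: &[f32], tol: f32) -> bool {
    actual.len() == expected.len()
        && actual
            .iter()
            .zip(expected)
            .all(|(a, e)| (a - e).abs() <= tol)
}
```

For example, `within_tolerance(&[1.0, 2.0], &[1.00005, 2.0], 1e-4)` is true, while `within_tolerance(&[1.0], &[1.001], 1e-4)` is false.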

antimora avatar Sep 30 '25 18:09 antimora

There is currently a limitation in the burn-import implementation for large ONNX files (> 2GB). See https://github.com/tracel-ai/burn/issues/3812

That's why stable-diffusion-xl-base-1.0 can't be supported at the moment.

antimora avatar Sep 30 '25 22:09 antimora