
How to convert DINOv2 to ONNX?

PeterKim1 opened this issue 9 months ago

Hi. Thanks for your great work.

I want to convert DINOv2 to ONNX, but I failed.

I tried to follow this issue: https://github.com/facebookresearch/dinov2/issues/19

I applied https://github.com/facebookresearch/dinov2/issues/19#issuecomment-1514310057, and after that, the error from https://github.com/facebookresearch/dinov2/issues/19#issuecomment-1514398389 occurred.

So I tried applying https://github.com/facebookresearch/dinov2/issues/19#issuecomment-1514423145, but the error still occurs.

Are there any guidelines for ONNX conversion?

I need to get this model working quickly for semantic segmentation tasks.

Thanks.

PeterKim1 avatar Sep 15 '23 04:09 PeterKim1

Changing this bit in vision_transformer.py at line 187 will allow export:

# Replacement interpolate call from this workaround; N, dim, w0 and h0 are
# defined earlier in interpolate_pos_encoding().
patch_pos_embed = nn.functional.interpolate(
    patch_pos_embed.reshape(1, int(math.sqrt(N)), int(math.sqrt(N)), dim).permute(0, 3, 1, 2),
    scale_factor=(float(w0 / math.sqrt(N)), float(h0 / math.sqrt(N))),
    mode="bicubic",
)

seddonm1 avatar Sep 21 '23 06:09 seddonm1
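For reference, a minimal export sketch once the patch above is applied. This is a hedged example: the local repo path, input size, and opset are illustrative, and dinov2_vits14 is the small-backbone torch.hub entry point.

import torch

# Load the patched local clone via torch.hub (the path is illustrative).
model = torch.hub.load("/path/to/dinov2", "dinov2_vits14", source="local")
model.eval()

# H and W must be multiples of the 14-pixel patch size.
dummy = torch.randn(1, 3, 518, 518)

torch.onnx.export(
    model,
    dummy,
    "dinov2_vits14.onnx",
    input_names=["input"],
    output_names=["output"],
    opset_version=17,
)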

Thanks @seddonm1 for the workaround; it works for me with batch size 1. However, I am trying to use a dynamic batch size. I am able to convert the model to ONNX with a dynamic batch size, but when I load it I get an error. Has anyone managed this?

dacquaviva avatar Sep 25 '23 13:09 dacquaviva
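If only the batch axis needs to be dynamic (H and W stay fixed), marking it via dynamic_axes is the usual route. A sketch under that assumption (the axis name, file names, and repo path are illustrative):

import torch

model = torch.hub.load("/path/to/dinov2", "dinov2_vits14", source="local")  # path illustrative
model.eval()

# Export with batch size > 1 so batch-dependent ops are not traced as constants.
dummy = torch.randn(2, 3, 518, 518)

torch.onnx.export(
    model,
    dummy,
    "dinov2_vits14_dynbatch.onnx",
    input_names=["input"],
    output_names=["output"],
    dynamic_axes={"input": {0: "batch"}, "output": {0: "batch"}},
    opset_version=17,
)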

I've exported class token + patch tokens: https://github.com/facebookresearch/dinov2/issues/167#issuecomment-2030326350

barbolo avatar Apr 01 '24 18:04 barbolo

I've tried to export to ONNX using dynamic input and output shapes. The model exports and seems fine; however, the ONNX model throws an exception during inference whenever the input shape differs from the sample fed during export.

For example, when I export a model with an input of shape [1, 3, 168, 168] (batch_size x C x H x W), the last hidden state (class token + patch tokens) has 145 tokens. When I try to use that model with an input of shape [1, 3, 112, 112] (which should output 65 tokens), the following exception is thrown:

[E:onnxruntime:, sequential_executor.cc:514 ExecuteKernel] Non-zero status code returned while running Add node. Name:'/embeddings/Add' Status Message: /Users/runner/work/1/s/onnxruntime/core/providers/cpu/math/element_wise_ops.h:560 void onnxruntime::BroadcastIterator::Append(ptrdiff_t, ptrdiff_t) axis == 1 || axis == largest was false. Attempting to broadcast an axis by a dimension other than 1. 65 by 145

barbolo avatar Apr 06 '24 23:04 barbolo
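The "65 by 145" in that message matches the ViT-*/14 token counts: the interpolated position embedding appears to be traced as a constant sized for the export sample, so an input with a different token count fails the Add broadcast. A quick sanity check:

# Token count = (H // patch) * (W // patch) patch tokens + 1 class token,
# with the 14-pixel patch size of the ViT-*/14 models.
def num_tokens(h: int, w: int, patch: int = 14) -> int:
    return (h // patch) * (w // patch) + 1

print(num_tokens(168, 168))  # 145 -> baked in at export time
print(num_tokens(112, 112))  # 65  -> mismatch, hence "65 by 145"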

@barbolo Hello, I got the same error. Have you figured out how to solve this problem?

WulongGuo avatar Apr 24 '24 01:04 WulongGuo

@WulongGuo No, I haven't, and I'm not sure there is a solution. I've seen other ViT-like repositories with downloadable ONNX/OpenVINO models, and all of them have fixed input shapes.

For my use case, I'm interested in reducing inference time, so I've exported the model once per input shape I use and I keep those models loaded in memory. This approach uses more memory, but inference time is optimized.

barbolo avatar Apr 24 '24 12:04 barbolo
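A minimal sketch of that per-shape approach with onnxruntime (the file names, the "input" tensor name, and the shape set are illustrative; one ONNX file is exported per supported input shape in advance):

import numpy as np
import onnxruntime as ort

# One pre-exported ONNX file per supported (H, W) input shape.
SESSIONS = {
    (168, 168): ort.InferenceSession("dinov2_168.onnx"),
    (112, 112): ort.InferenceSession("dinov2_112.onnx"),
}

def run(image: np.ndarray) -> np.ndarray:
    # image: float32 array of shape (1, 3, H, W) with a pre-exported (H, W);
    # "input" must match the input name used at export time.
    session = SESSIONS[tuple(image.shape[2:])]
    return session.run(None, {"input": image})[0]

Keeping all sessions resident trades memory for latency, as described above.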

@barbolo OK, thanks for your reply. I'll just use the fixed-input version.

WulongGuo avatar Apr 25 '24 01:04 WulongGuo

I also hit this problem. The model can run inference with different input shapes in Python before export, but the exported ONNX model can only run inference on inputs with the same W and H as the export sample; it can't run on other shapes.

Zalways avatar Apr 29 '24 02:04 Zalways

I'd appreciate it if you could solve this problem.

Zalways avatar Apr 29 '24 02:04 Zalways