Official ONNX Export Script for Dynamic Batching
Search before asking
- [x] I have searched the RF-DETR issues and found no similar feature requests.
Description
Dear RF-DETR Team,
First off, thank you for creating and maintaining this fantastic project. The RF-DETR Nano model is incredibly efficient, and I am looking to use it in a multi-stream video analytics pipeline.
The Goal:
The goal is to convert the RF-DETR Nano model to a TensorRT engine that supports dynamic batch sizes. This is critical for maximizing GPU throughput in a real-time application that processes frames from multiple camera streams simultaneously.
The Problem:
The current default export method (model.export()) produces a static ONNX file with a hardcoded batch size of 1. When I tried to convert this to a TensorRT engine with dynamic shapes, trtexec failed as expected.
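For context, this is the kind of trtexec invocation that fails against the static graph. (The input tensor name and the 384x384 resolution are assumptions on my part; substitute whatever your export actually uses.)

```
trtexec --onnx=inference_model.onnx \
        --minShapes=input:1x3x384x384 \
        --optShapes=input:8x3x384x384 \
        --maxShapes=input:16x3x384x384 \
        --saveEngine=rfdetr_nano.engine \
        --fp16
```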
I spent considerable time trying to create a custom export script to produce a valid, dynamic ONNX file, but have run into a series of issues that suggest the model's architecture contains patterns that are incompatible with the standard torch.onnx.export tracing process when dynamic axes are used.
Summary of Attempts and Findings:
1. Initial Attempt (Dynamic Axes): I created a custom export script using torch.onnx.export with dynamic_axes set for the batch dimension. This failed during trtexec conversion with a "reshape would change volume" error, indicating that an internal Reshape operation was using a hardcoded shape based on the trace-time batch size of 1.
2. Input Type Mismatch: We then discovered the model's forward pass expects a custom NestedTensor object, not a raw tensor. The tracer was failing because it was providing the wrong input type.
3. Wrapper Approach: I implemented the standard "ONNX Export Wrapper" pattern. This correctly constructs the NestedTensor from raw tensor inputs, satisfying the model's API, and resolved the input type errors.
4. Unsupported Operator: The wrapper allowed the tracer to go deeper, where it found an unsupported operator aten::_upsample_bicubic2d_aa. I solved this by registering a custom symbolic function to map it to the standard ONNX Resize operator.
5. Final trtexec Error: Even with a seemingly valid ONNX file, trtexec ultimately fails during parsing with an INVALID_NODE error on a ConvTranspose layer, stating that the number of input channels cannot be dynamic. We have also seen Concat layer errors where a cls_token with a static batch size of 1 is concatenated with feature maps that have a dynamic batch size. (A condensed code sketch of attempts 1 through 4 follows this list.)
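For concreteness, here is a condensed sketch of attempts 1 through 4. The NestedTensor import path, the 384x384 input resolution, and the output names are my best guesses rather than confirmed rfdetr internals, and `model` is assumed to be the underlying torch.nn.Module:

```python
import torch
from torch.onnx import register_custom_op_symbolic, symbolic_helper

# Attempt 4: map aten::_upsample_bicubic2d_aa onto the standard ONNX Resize op.
# Antialiasing is dropped here (Resize only gained an antialias attribute in
# opset 18), so this is an approximation rather than an exact equivalent.
@symbolic_helper.parse_args("v", "v", "b", "v", "v")
def upsample_bicubic2d_aa(g, input, output_size, align_corners, scales_h, scales_w):
    # Resize expects the full [N, C, H, W] output shape: keep N and C from the
    # input at runtime and append the requested spatial size.
    nc = g.op(
        "Slice",
        g.op("Shape", input),
        g.op("Constant", value_t=torch.tensor([0], dtype=torch.int64)),
        g.op("Constant", value_t=torch.tensor([2], dtype=torch.int64)),
    )
    sizes = g.op("Concat", nc, g.op("Cast", output_size, to_i=7), axis_i=0)
    empty = g.op("Constant", value_t=torch.tensor([], dtype=torch.float32))
    coord = "align_corners" if align_corners else "pytorch_half_pixel"
    return g.op("Resize", input, empty, empty, sizes,
                mode_s="cubic", coordinate_transformation_mode_s=coord)

register_custom_op_symbolic("aten::_upsample_bicubic2d_aa", upsample_bicubic2d_aa, 17)

# Attempts 2 and 3: wrap the model so the tracer sees plain tensors while the
# forward pass still receives the NestedTensor the model's API expects.
class ExportWrapper(torch.nn.Module):
    def __init__(self, model):
        super().__init__()
        self.model = model

    def forward(self, images):
        from rfdetr.util.misc import NestedTensor  # import path is a guess
        mask = torch.zeros(images.shape[0], images.shape[2], images.shape[3],
                           dtype=torch.bool, device=images.device)
        return self.model(NestedTensor(images, mask))

# Attempt 1: export with a symbolic batch axis. Tracing with batch > 1 helps
# surface Reshape ops that baked in the trace-time batch size.
dummy = torch.randn(2, 3, 384, 384)
torch.onnx.export(
    ExportWrapper(model), dummy, "rfdetr_dynamic.onnx",
    input_names=["input"], output_names=["dets", "labels"],
    dynamic_axes={"input": {0: "batch"}, "dets": {0: "batch"}, "labels": {0: "batch"}},
    opset_version=17,
)
```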
Conclusion:
These findings strongly suggest that the RF-DETR architecture contains several operations (like ConvTranspose with dynamic channels or hardcoded constants like cls_token) that make it fundamentally incompatible with a straightforward dynamic batch export.
Feature Request:
Would it be possible for the development team to provide an official, supported Python script for exporting RF-DETR models to ONNX with dynamic batch support that works with trtexec?
Having an official script would be a massive benefit to everyone in the community looking to deploy these models in high-performance, multi-stream environments, and it would ensure that the known architectural hurdles are handled correctly during export.
Thank you for your consideration and for all your work on this project.
Use case
No response
Additional
No response
Are you willing to submit a PR?
- [ ] Yes I'd like to help by submitting a PR!
Not easy due to some internals of the model. Not impossible, though. We may need to implement it anyway for some internal work.
Btw, the bicubic op should not make it into the ONNX graph. There's some logic in optimize_for_inference that gets rid of it, and via the official export method it shouldn't be there either. Can you try those tools and see if it persists?
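For anyone trying this, a minimal sketch of the suggested path; whether optimize_for_inference composes with export exactly like this is an assumption:

```python
from rfdetr import RFDETRNano

model = RFDETRNano(pretrain_weights=checkpoint)
model.optimize_for_inference()  # per the comment above, this should remove the bicubic resize
model.export(output_dir=MODEL_FOLDER)  # official export path
```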
> Unsupported Operator: The wrapper allowed the tracer to go deeper, where it found an unsupported operator aten::_upsample_bicubic2d_aa. I solved this by registering a custom symbolic function to map it to the standard ONNX Resize operator.
Just adding on top of that: I tried exporting to ONNX and ran into a similar issue, and solved it by changing the interpolation mode to "bilinear" as below:
```python
patch_pos_embed = nn.functional.interpolate(
    patch_pos_embed.to(dtype=torch.float32),
    size=(torch_int(height), torch_int(width)),  # Explicit size instead of scale_factor
    mode="bilinear",
    align_corners=False,
    # antialias=True,
).to(dtype=target_dtype)
```
in the following files: .venv\Lib\site-packages\rfdetr\models\backbone\dinov2_with_windowed_attn.py and .venv\Lib\site-packages\rfdetr\models\backbone\dinov2.py
Not sure if that helps; just wanted to provide more information on this matter.
Again, that layer shouldn't even be in the exported graph. It's only there to resize the positional embedding for different image resolutions, which is fixed at export time. Are you exporting the official way and seeing this layer present?
Thanks for the clarification. I tried to recreate the export using what I believe is the official method; please let me know if I've misunderstood anything. I'm sharing only the relevant part of the code here:
print(f" Loading model from {checkpoint}")
model = RFDETRNano(pretrain_weights=checkpoint)
# Run export
print(f"Exporting to ONNX...")
model.export(
output_dir=MODEL_FOLDER,
simplify=SIMPLIFY,
shape=EXPORT_SHAPE,
verbose=True
)
However, I still ran into the following error:
```
Failed to export xxx : Exporting the operator 'aten::_upsample_bicubic2d_aa' to ONNX opset version 17 is not supported
```
Just wanted to check if this matches what you mean by the official export path, or if I may be missing a step.
Just a quick update: the export works after commenting out antialias=True or changing it to False.
```python
patch_pos_embed = nn.functional.interpolate(
    patch_pos_embed.to(dtype=torch.float32),
    size=(torch_int(height), torch_int(width)),  # Explicit size instead of scale_factor
    mode="bicubic",
    align_corners=False,
    # antialias=True,
).to(dtype=target_dtype)
```
Hope this helps
> Not easy due to some internals of the model. Not impossible, though. We may need to implement it anyway for some internal work.
>
> Btw, the bicubic op should not make it into the ONNX graph. There's some logic in optimize_for_inference that gets rid of it, and via the official export method it shouldn't be there either. Can you try those tools and see if it persists?
Thanks for the prompt reply. optimize_for_inference does seem to remove the bicubic op from the graph, but I still could not export to TensorRT with dynamic batches. As I read more about DETRs, I can see that architectural limitations prevent a straightforward dynamic batch export.
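For anyone debugging the same thing, a quick way to verify whether an exported graph actually carries a symbolic batch dimension (the file name below is a placeholder):

```python
import onnx

m = onnx.load("inference_model.onnx")  # placeholder path
# A dynamic export should report a named dim (e.g. "batch") at position 0 of
# each input/output, not a baked-in 1.
for t in list(m.graph.input) + list(m.graph.output):
    dims = [d.dim_param or d.dim_value for d in t.type.tensor_type.shape.dim]
    print(t.name, dims)
```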
I hope we will see RF-DETR supporting dynamic batch export officially someday. Congrats on launching the segmentation models; it's good to see RF-DETR evolving into a proper computer vision suite.
If appropriate, is there a way to bump this issue about getting RF-DETR running properly with TensorRT? If anyone has resources I should look at, please let me know.