Severe accuracy drop when using optimize_for_inference(dtype=torch.float16)
Search before asking
- [x] I have searched the RF-DETR issues and found no similar bug report.
Bug
Hi, thanks for your great work on RFDETR!
I encountered a severe accuracy drop when running inference with FP16.
Here are the details:
Tested models:
- Official RFDETRNano
- Official RFDETRSmall
- My custom-trained RFDETRNano (trained on my own dataset)

Code example:
from rfdetr import RFDETRNano
import torch
model = RFDETRNano()
model.optimize_for_inference(dtype=torch.float16)
# Run inference...
Once I apply optimize_for_inference(dtype=torch.float16), the inference accuracy drops significantly compared to FP32.
This happens across both official pretrained models and my own trained model.
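Here is roughly how I compared FP32 and FP16 side by side (a minimal sketch: 'demo.jpg' is a placeholder image, reading .confidence assumes predict returns a supervision Detections object, and detection counts/confidences are only a rough proxy for accuracy; a proper check would compute mAP on a validation set):
from PIL import Image
import torch
from rfdetr import RFDETRNano

image = Image.open('demo.jpg')  # placeholder test image

# Baseline: FP32 inference
fp32_model = RFDETRNano()
fp32_dets = fp32_model.predict(image, conf=0.2)

# Same model, optimized for FP16 inference
fp16_model = RFDETRNano()
fp16_model.optimize_for_inference(dtype=torch.float16)
fp16_dets = fp16_model.predict(image, conf=0.2)

# Compare detection counts and top confidences between the two runs
print('FP32:', len(fp32_dets), 'detections, top confidences:', fp32_dets.confidence[:5])
print('FP16:', len(fp16_dets), 'detections, top confidences:', fp16_dets.confidence[:5])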
Could you confirm if FP16 inference is supported, or if additional steps are needed to maintain accuracy?
Thanks!
Environment
- RF-DETR 1.2.0
- OS: both Ubuntu 22.04 and Windows 11
- Python 3.10
- torch 2.7.0
- CUDA 12.4
- GPU: both RTX 3060 and RTX A5000
Minimal Reproducible Example
from PIL import Image
import torch
from rfdetr import RFDETRNano

image = Image.open('demo.jpg')
model = RFDETRNano()
model.optimize_for_inference(dtype=torch.float16)
detections = model.predict(image, conf=0.2)
Additional
I also tried exporting the model to TorchScript and compared the accuracy between:
- using .half() before export
- without .half()
In both cases, I observed the same issue: inference accuracy drops significantly when running in FP16.
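For reference, this is roughly the shape of the TorchScript comparison I ran (a minimal sketch: the tiny stand-in module, the 640x640 input, and running on CUDA are assumptions, not the actual rf-detr export path):
import copy
import torch
import torch.nn as nn

# Stand-in module so the sketch runs; with rf-detr this would be the underlying
# detection network (how to obtain it from the wrapper is not shown here).
torch_model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.ReLU()).eval().cuda()
dummy = torch.randn(1, 3, 640, 640, device='cuda')

# Export without .half() (FP32 weights)
ts_fp32 = torch.jit.trace(torch_model, dummy)

# Export with .half() applied before tracing (FP16 weights and inputs)
ts_fp16 = torch.jit.trace(copy.deepcopy(torch_model).half(), dummy.half())

with torch.no_grad():
    out_fp32 = ts_fp32(dummy)
    out_fp16 = ts_fp16(dummy.half()).float()

# With the real model, the FP16 outputs diverge enough to change the detections
print('max abs diff:', (out_fp32 - out_fp16).abs().max().item())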
Are you willing to submit a PR?
- [ ] Yes, I'd like to help by submitting a PR!
Same here; I used a custom-trained RFDETRMedium.
We're aware of the issue. It's the same thing that causes degradation in TensorRT, which is resolved by the approach in https://github.com/roboflow/rf-detr/issues/176, but I guess that isn't relevant for TorchScript.
Models trained on and exported from the Roboflow platform SHOULD work out of the box in FP16 with no accuracy decay, as we have a different implementation there. Once we have bandwidth, we'll release more of the on-platform implementation, which should resolve the issue. There's a very small team behind this project, so apologies for the delay.
btw how does it work if you use bfloat16?
Using bfloat16 fixes this issue and provides the same improvement.
model.optimize_for_inference(dtype=torch.bfloat16)
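For what it's worth, my guess as to why bfloat16 behaves better here: float16 has a much narrower dynamic range (max finite value around 65504), so large intermediate values can overflow, while bfloat16 keeps float32's exponent range at reduced mantissa precision. A quick illustration:
import torch

x = torch.tensor([70000.0, 1e-9])
print(x.half())      # large value overflows to inf, tiny value underflows to 0
print(x.bfloat16())  # both stay finite, just at reduced precision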
The new model definition should be much more stable in half precision.