
Severe accuracy drop when using optimize_for_inference(dtype=torch.float16)

Open LilG-Cheng-Ju opened this issue 3 months ago • 5 comments

Search before asking

  • [x] I have searched the RF-DETR issues and found no similar bug report.

Bug

Hi, thanks for your great work on RF-DETR!

I encountered a severe accuracy drop when running inference with FP16.
Here are the details:

  • Tested models:

    • Official RFDETRNano
    • Official RFDETRSmall
    • My custom-trained RFDETRNano (trained on my own dataset)
  • Code example:

from rfdetr import RFDETRNano
import torch

model = RFDETRNano()
model.optimize_for_inference(dtype=torch.float16)

# Run inference...

Once I apply optimize_for_inference(dtype=torch.float16), inference accuracy drops significantly compared to FP32. This happens with both the official pretrained models and my own trained model. Could you confirm whether FP16 inference is supported, or whether additional steps are needed to maintain accuracy?

Thanks!

Environment

  • RF-DETR 1.2.0
  • OS: both Ubuntu 22.04 and Windows 11
  • Python 3.10
  • torch 2.7.0
  • CUDA 12.4
  • GPU: both RTX 3060 and RTX A5000

Minimal Reproducible Example

from PIL import Image
from rfdetr import RFDETRNano
import torch

image = Image.open('demo.jpg')
model = RFDETRNano()
model.optimize_for_inference(dtype=torch.float16)
detections = model.predict(image, conf=0.2)
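
To make the drop concrete, here is a minimal sketch that runs the same image through an FP32 and an FP16 copy of the model side by side (it assumes predict returns a supervision Detections object with a confidence field, as in the RF-DETR README):

from PIL import Image
from rfdetr import RFDETRNano
import torch

image = Image.open('demo.jpg')

# Baseline: default FP32 model.
model_fp32 = RFDETRNano()
det_fp32 = model_fp32.predict(image, conf=0.2)

# Same model optimized for FP16 inference.
model_fp16 = RFDETRNano()
model_fp16.optimize_for_inference(dtype=torch.float16)
det_fp16 = model_fp16.predict(image, conf=0.2)

# Compare detection counts and confidence scores; with the bug,
# the FP16 copy loses or misplaces many detections.
print(len(det_fp32), det_fp32.confidence)
print(len(det_fp16), det_fp16.confidence)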

Additional

I also tried exporting the model to TorchScript and compared the accuracy between

  • using .half() before export
  • and without .half()

In both cases, I observed the same issue: inference accuracy drops significantly when running in FP16.
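
For reference, a sketch of the TorchScript export path described above; the underlying nn.Module attribute (model.model) and the input resolution are assumptions for illustration, not the actual RF-DETR internals:

import torch

# Hypothetical: grab the underlying nn.Module (attribute name assumed).
core = model.model.eval().cuda()
example = torch.randn(1, 3, 384, 384, device='cuda')  # resolution assumed

# Variant A: cast weights to FP16 before tracing.
traced_half = torch.jit.trace(core.half(), example.half())
traced_half.save('rfdetr_fp16.ts')

# Variant B: trace in FP32 and cast to half at inference time instead.
core = core.float()
traced_full = torch.jit.trace(core, example.float())
traced_full.save('rfdetr_fp32.ts')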

Are you willing to submit a PR?

  • [ ] Yes, I'd like to help by submitting a PR!

LilG-Cheng-Ju avatar Aug 19 '25 02:08 LilG-Cheng-Ju

Same here, I used a custom trained RFDETRMedium.

westlinkin avatar Aug 20 '25 06:08 westlinkin

We're aware of the issue; it's the same thing that causes degradation in TensorRT, which is resolved by the approach in https://github.com/roboflow/rf-detr/issues/176, but I guess that isn't relevant for TorchScript.

Models trained on and exported from the Roboflow platform SHOULD work out of the box in FP16 with no decay, as we have a different implementation there. Once we have bandwidth, we'll release more of the on-platform implementation, which should resolve the issue. There's a very, very small team behind this project; apologies for the delay.

isaacrob-roboflow avatar Aug 20 '25 14:08 isaacrob-roboflow

btw, how does it behave if you use bfloat16?

isaacrob-roboflow avatar Aug 21 '25 16:08 isaacrob-roboflow

Using bfloat16 fixes this issue and provides the same performance improvement:

model.optimize_for_inference(dtype=torch.bfloat16)
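
A plausible explanation (an assumption, not confirmed in this thread): bfloat16 keeps FP32's 8-bit exponent, so activations that overflow FP16's much smaller range (e.g. in attention or normalization layers) stay finite, at the cost of fewer mantissa bits:

import torch

# FP16 saturates just past 65k, while BF16 covers FP32's full range.
print(torch.finfo(torch.float16).max)   # 65504.0
print(torch.finfo(torch.bfloat16).max)  # ~3.39e38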

djaygier avatar Sep 22 '25 23:09 djaygier

The new model definition should be much more stable in half precision.

isaacrob-roboflow avatar Oct 02 '25 23:10 isaacrob-roboflow