Research into DEIMv2
Hi there, I have been using YOLOX on edge devices and am now looking to switch to newer DETR-based models for better accuracy and performance. I am curious whether you have taken a look at the recently released DEIMv2, which claims to be SOTA, improves upon D-FINE, uses the new DINOv3 backbone, and comes in many more model sizes.
It would be great if you could look into it, if you have the time, to see whether it is indeed better than D-FINE in both training and inference, and whether you could use it to improve / update your implementations even further. Thank you!
Hey, it looks interesting, but:
- If you check the speed/accuracy numbers from their own benchmarks, you will see that D-FINE is still better up to the L size, because DEIMv2 S latency is roughly equal to D-FINE M latency. But DEIMv2 L and X are interesting, and the tiny models are handy too.
- They didn't compare against RF-DETR, which also claims to be SOTA.
They mention that they haven't optimized inference speed yet, so it may get even better. For now, my main question is whether there are models that beat D-FINE in both speed and accuracy, and whether the difference is meaningful.
I will continue looking into newer models. If you happen to do an apples-to-apples comparison of RF-DETR and DEIMv2, please share.
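For reference, this is roughly how I would measure it: a minimal sketch assuming both models are exported to ONNX and take a single 640x640 NCHW image tensor as input. The file names, input shape, and execution provider below are placeholders, not the actual exports; if an export also takes extra inputs (e.g. original image sizes), those would need to be fed as well.

```python
# Rough latency sketch, assuming both models are exported to ONNX and take a
# single NCHW image tensor as input. Paths and shapes below are placeholders.
import time

import numpy as np
import onnxruntime as ort


def latency_ms(onnx_path, shape=(1, 3, 640, 640), warmup=20, iters=200):
    """Mean per-image latency in ms for one ONNX model on a fixed dummy input."""
    sess = ort.InferenceSession(onnx_path, providers=["CPUExecutionProvider"])
    name = sess.get_inputs()[0].name
    dummy = np.random.rand(*shape).astype(np.float32)

    for _ in range(warmup):            # warm-up so allocations don't skew timing
        sess.run(None, {name: dummy})

    start = time.perf_counter()
    for _ in range(iters):
        sess.run(None, {name: dummy})
    return (time.perf_counter() - start) / iters * 1000.0


if __name__ == "__main__":
    # Hypothetical export paths; swap in the real files, same input size for both.
    for model, path in [("RF-DETR S", "rfdetr_s.onnx"), ("DEIMv2 S", "deimv2_s.onnx")]:
        print(f"{model}: {latency_ms(path):.2f} ms / image")
```

Same machine, same input size, same precision, same warm-up and number of timed runs for both models; otherwise the numbers won't be comparable with each other or with the papers' tables.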
I saw you started to work on adding a segmentation head to the custom_d_fine 🥇
Will be so cool. Fingers crossed.
Another interesting one: https://arxiv.org/pdf/2510.25257
So DEIMv2 will be skipped. I will consider RT-DETRv4, but the reported latencies look a little weird and should be tested.