Performance Issue Fine-Tuning D-FINE S with custom dataset
Hi everyone,
I’m fine-tuning the D-FINE S model initialized with Objects365 weights using a custom dataset, but I’m encountering significantly lower performance than expected.
Performance Comparison:
In just a single epoch:
- D-FINE S achieves mAP@50=0.384.
- YOLOv11 S (initialized with COCO weights) achieves mAP@50=0.842.
1. Dataset Preparation:
- Converted custom dataset annotations (x1,y1,x2,y2) to COCO format (x1, y1, w, h).
- Since the dataset includes truncated images, I added this line to train.py:
ImageFile.LOAD_TRUNCATED_IMAGES = True
- This dataset contains only one class, category_id=0, so I set num_classes: 1 and remap_mscoco_category: False.
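For reference, the box conversion described in step 1 can be sketched as below. This is only a minimal sketch and not the poster's actual script: the function names `xyxy_to_coco_xywh` and `make_annotation` are hypothetical, and the annotation fields follow the standard COCO detection layout (absolute pixel values, `bbox` as `[x_min, y_min, width, height]`).

```python
def xyxy_to_coco_xywh(box):
    """Convert corner coordinates (x1, y1, x2, y2) to COCO [x_min, y_min, w, h]."""
    x1, y1, x2, y2 = box
    # Tolerate corners given in either order.
    x_min, x_max = min(x1, x2), max(x1, x2)
    y_min, y_max = min(y1, y2), max(y1, y2)
    return [x_min, y_min, x_max - x_min, y_max - y_min]


def make_annotation(ann_id, image_id, box_xyxy, category_id=0):
    """Build one COCO-style annotation dict (hypothetical helper).

    category_id=0 mirrors the single-class setup described in the post.
    """
    x, y, w, h = xyxy_to_coco_xywh(box_xyxy)
    return {
        "id": ann_id,
        "image_id": image_id,
        "category_id": category_id,
        "bbox": [x, y, w, h],
        "area": w * h,
        "iscrowd": 0,
    }


if __name__ == "__main__":
    print(make_annotation(1, 42, (10, 20, 50, 80)))
```

A common mistake in this step is to write `x2, y2` into the last two slots instead of the width and height, which still loads without errors but produces oversized boxes.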
2. Configuration Changes:
- Using dfine_hgnetv2_s_obj2custom.yml as the base configuration.
- Modified only the dataset path and adjusted the batch size to 8 for both training and validation.
3. Training Command:
CUDA_VISIBLE_DEVICES=0 torchrun --master_port=7777 --nproc_per_node=1 train.py \
-c configs/dfine/custom/objects365/dfine_hgnetv2_s_obj2custom.yml \
--seed=0 -t dfine_s_obj365.pth
What could be causing the poor performance? I’d appreciate any insights or suggestions for debugging this issue. Thanks in advance!
PS: I tried fine-tuning for 20 epochs, and the mAP@50 only improves slightly to 0.4. Using COCO weights instead of Objects365 weights does not lead to any improvement either.
Can you try training from scratch without any pre-training weights?
I trained the model from scratch but obtained an mAP@50 of -1.000. I suspect this may be due to the annotation format not aligning with D-FINE's requirements. Could you kindly clarify the correct format? Should the annotations use COCO format x1, y1, w, h, YOLO format (normalized, compatible with MIT YOLOv9) x_center, y_center, w, h, or another standard?
Thank you in advance.
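One way to narrow down the format question is to sanity-check the annotation JSON before training. The sketch below assumes the file follows the standard COCO detection layout (absolute pixel `[x_min, y_min, width, height]` boxes, plus `images` and `categories` lists); `check_coco` is a hypothetical helper, not part of D-FINE. As far as I know, an mAP of -1.000 is what pycocotools prints when it finds no ground truth to evaluate, so mismatched image or category ids are worth ruling out too.

```python
def check_coco(coco):
    """Return a list of human-readable problems found in a COCO-style dict."""
    problems = []
    images = {img["id"]: img for img in coco.get("images", [])}
    cat_ids = {cat["id"] for cat in coco.get("categories", [])}

    for ann in coco.get("annotations", []):
        tag = f"annotation {ann['id']}"
        img = images.get(ann["image_id"])
        if img is None:
            problems.append(f"{tag}: unknown image_id {ann['image_id']}")
            continue
        if ann["category_id"] not in cat_ids:
            problems.append(f"{tag}: unknown category_id {ann['category_id']}")

        x, y, w, h = ann["bbox"]
        if w <= 0 or h <= 0:
            problems.append(f"{tag}: non-positive width/height {ann['bbox']}")
        elif max(x, y, w, h) <= 1.0:
            # All values in [0, 1] suggests YOLO-style normalized boxes.
            problems.append(f"{tag}: bbox looks normalized {ann['bbox']}")
        elif x < 0 or y < 0 or x + w > img["width"] or y + h > img["height"]:
            # Often a sign that (x1, y1, x2, y2) was stored instead of (x, y, w, h).
            problems.append(f"{tag}: bbox exceeds image bounds {ann['bbox']}")
    return problems


if __name__ == "__main__":
    sample = {
        "images": [{"id": 1, "width": 100, "height": 80}],
        "categories": [{"id": 0, "name": "object"}],
        "annotations": [
            {"id": 1, "image_id": 1, "category_id": 0, "bbox": [10, 20, 30, 40]},
            {"id": 2, "image_id": 1, "category_id": 0, "bbox": [10, 20, 95, 70]},
        ],
    }
    for problem in check_coco(sample):
        print(problem)
```

To run it on a real file, load the JSON with `json.load` and pass the resulting dict to `check_coco`; an empty list means none of these particular checks fired, not that the file is guaranteed to be correct.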
Same here. I tried these 2 variants:
dfine_hgnetv2_${model}_obj365.yml and dfine_hgnetv2_${model}_custom.yml
and the mAP50 never goes above 0.35 while even YOLOv9c gets to 0.7
@sergiosanchoasensio Could you please tell me how to use the pretrained model checkpoint file? Please share your code if you have no issues with sharing it.
I also encountered the same problem. Has anyone solved it?
@AnishMathewOommen I used this command for a single-GPU setup with a custom dataset:
torchrun train.py -c configs/dfine/custom/objects365/dfine_hgnetv2_m_obj2custom.yml --use-amp --seed=0 -t dfine_m_coco.pth
I am also experiencing terrible performance with D-FINE. It just sucks (more likely I suck at training). I am using this dataset with just one class, and no matter what I do I end up with mAP at 0.15.
@FrancescoSaverioZuppichini try out this repo
man what the fuck, your code is amazing. It is training so much faster and it is so well done. Are you an advanced AI from the future?