
Performance Issue Fine-Tuning D-FINE S with custom dataset

Open sergiosanchoasensio opened this issue 1 year ago • 10 comments

Hi everyone,

I’m fine-tuning the D-FINE S model initialized with Objects365 weights using a custom dataset, but I’m encountering significantly lower performance than expected.

Performance comparison after just a single epoch:

  • D-FINE S achieves mAP@50=0.384.
  • YOLOv11 S (initialized with COCO weights) achieves mAP@50=0.842.

1. Dataset Preparation:

  • Converted custom dataset annotations (x1,y1,x2,y2) to COCO format (x1, y1, w, h).
  • Since the dataset includes truncated images, I added ImageFile.LOAD_TRUNCATED_IMAGES = True in train.py.
  • This dataset contains only one class, category_id=0, so I set num_classes: 1 and remap_mscoco_category: False.
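The bbox conversion described in the first bullet can be sketched as follows (the function name and the single category_id=0 category are illustrative, not code from the D-FINE repo):

```python
def xyxy_to_coco(box):
    """Convert an (x1, y1, x2, y2) corner box to COCO (x, y, w, h) format."""
    x1, y1, x2, y2 = box
    return [x1, y1, x2 - x1, y2 - y1]

# Example: a box with top-left corner (10, 20) and bottom-right corner (60, 50)
print(xyxy_to_coco([10, 20, 60, 50]))  # → [10, 20, 50, 30]
```

Note that COCO boxes are in absolute pixels, not normalized coordinates.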

2. Configuration Changes:

  • Using dfine_hgnetv2_s_obj2custom.yml as the base configuration.
  • Modified only the dataset path and adjusted the batch size to 8 for both training and validation.
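For reference, the overrides described above would look roughly like this in the YAML (the exact key names are assumptions based on the issue text, not verified against the repo):

```yaml
num_classes: 1
remap_mscoco_category: False

train_dataloader:
  total_batch_size: 8

val_dataloader:
  total_batch_size: 8
```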

3. Training Command:

CUDA_VISIBLE_DEVICES=0 torchrun --master_port=7777 --nproc_per_node=1 train.py \
  -c configs/dfine/custom/objects365/dfine_hgnetv2_s_obj2custom.yml \
  --seed=0 -t dfine_s_obj365.pth

What could be causing the poor performance? I’d appreciate any insights or suggestions for debugging this issue. Thanks in advance!

PS: I tried fine-tuning for 20 epochs, and the mAP@50 only improves slightly to 0.4. Using COCO weights instead of Objects365 weights does not lead to any improvement either.

sergiosanchoasensio avatar Dec 10 '24 16:12 sergiosanchoasensio

Can you try training from scratch without any pre-training weights?

Peterande avatar Dec 11 '24 07:12 Peterande

Can you try training from scratch without any pre-training weights?

I trained the model from scratch but obtained an mAP@50 of -1.000. I suspect this may be due to the annotation format not aligning with D-FINE's requirements. Could you kindly clarify the correct format? Should the annotations use COCO format (x1, y1, w, h), YOLO format (normalized x_center, y_center, w, h, as used by MIT YOLOv9), or another standard?
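For what it's worth, an mAP of exactly -1.000 is the placeholder pycocotools reports when nothing could be evaluated at all, which usually points to an image_id/category_id mismatch rather than the bbox convention. As far as I can tell, D-FINE expects standard COCO-style JSON with absolute-pixel [x, y, w, h] boxes; a minimal skeleton looks like this (file names, ids, and sizes are invented for illustration):

```python
import json

# Minimal COCO-style annotation skeleton with a single category (id 0).
coco = {
    "images": [
        {"id": 1, "file_name": "0001.jpg", "width": 640, "height": 480},
    ],
    "annotations": [
        # bbox is [x, y, w, h] in absolute pixels, not normalized
        {"id": 1, "image_id": 1, "category_id": 0,
         "bbox": [10, 20, 50, 30], "area": 50 * 30, "iscrowd": 0},
    ],
    "categories": [
        {"id": 0, "name": "object"},
    ],
}

with open("instances_train.json", "w") as f:
    json.dump(coco, f)
```

Every annotation's image_id must match an entry in "images", and every category_id must appear in "categories".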

Thank you in advance.

sergiosanchoasensio avatar Dec 18 '24 15:12 sergiosanchoasensio

Same here. I tried these two variants: dfine_hgnetv2_${model}_obj365.yml and dfine_hgnetv2_${model}_custom.yml, and the mAP@50 never goes above 0.35, while even YOLOv9-C reaches 0.7.

prashant-dn avatar Dec 27 '24 07:12 prashant-dn

@sergiosanchoasensio Could you please tell me how to use the pretrained model checkpoint file? Please share the code if you have no issues sharing it.

AnishMathewOommen avatar Dec 31 '24 07:12 AnishMathewOommen

I also encountered the same problem. Has anyone solved it?

S130111 avatar Jan 07 '25 03:01 S130111

@AnishMathewOommen I used this command for a single-GPU setup and a custom dataset:

torchrun train.py -c configs/dfine/custom/objects365/dfine_hgnetv2_m_obj2custom.yml \
  --use-amp --seed=0 -t dfine_m_coco.pth

ArgoHA avatar Jan 07 '25 11:01 ArgoHA

I am also experiencing terrible performance with D-FINE; it just sucks (more likely, I suck at training). I am using a dataset with just one class, and no matter what I do I end up with an mAP of 0.15.

@FrancescoSaverioZuppichini try out this repo

ArgoHA avatar Jul 09 '25 14:07 ArgoHA

@FrancescoSaverioZuppichini try out this repo

Thanks a lot man, it's on my todo list!

@FrancescoSaverioZuppichini try out this repo

Man, what the fuck, your code is amazing. It's training so much faster and it's so well done. Are you an advanced AI from the future?