
Difference in visualization results between MMGroundingDINO and GroundingDINO on a single image

Open qianbo-x opened this issue 4 months ago • 0 comments

I ran detection on the image below with both GroundingDINO and MMGroundingDINO, using the following commands:

Image

python demo/image_demo.py 000000002299.jpg configs/grounding_dino/grounding_dino_swin-t_pretrain_obj365_goldg_cap4m.py --weights groundingdino_swint_ogc_mmdet-822d7e9d.pth --texts 'person' (for GroundingDINO)

python demo/image_demo.py 000000002299.jpg configs/mm_grounding_dino/grounding_dino_swin-t_pretrain_obj365.py --weights grounding_dino_swin-t_pretrain_obj365_goldg_grit9m_v3det_20231204_095047-b448804b.pth --texts 'persons' (for MMGroundingDINO)

As shown in the results, the predicted scores from MMGroundingDINO are generally lower, which leads to some missed detections. In contrast, GroundingDINO gives relatively higher scores.

MMGroundingDINO GroundingDINO
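
To make the link between lower scores and missed detections concrete, here is a minimal sketch. It assumes the demo only draws boxes whose score clears a fixed threshold (`image_demo.py` exposes this as `--pred-score-thr`; the value 0.3 below is an assumption). The scores are made up for illustration and are not the outputs of either model, and `filter_by_score` is a hypothetical helper, not part of MMDetection.

```python
from typing import List, Tuple

# (label, score) pairs; a stand-in for real detector outputs
Detection = Tuple[str, float]


def filter_by_score(dets: List[Detection], thr: float) -> List[Detection]:
    """Keep only the detections whose score is at least `thr`."""
    return [d for d in dets if d[1] >= thr]


# Hypothetical scores for the same three people from two models.
higher_scores = [("person", 0.62), ("person", 0.45), ("person", 0.33)]
lower_scores = [("person", 0.41), ("person", 0.28), ("person", 0.19)]

for thr in (0.3, 0.15):
    kept_high = filter_by_score(higher_scores, thr)
    kept_low = filter_by_score(lower_scores, thr)
    print(f"thr={thr}: higher-score model keeps {len(kept_high)}, "
          f"lower-score model keeps {len(kept_low)}")
```

With the same threshold, the model with uniformly lower scores shows fewer boxes even though it localized the same objects. Lowering the threshold brings them back, which is why comparing raw scores may be more informative than comparing the rendered images.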

Previously, I trained a prompt-based detector built on the DINO codebase and also observed low predicted scores. When I printed the DINO baseline's prediction scores on this image, they were already quite low.

Have you encountered a similar issue?

Looking forward to your reply~

Note: This image is from the COCO validation set, with the path: val2017/000000002299.jpg.

qianbo-x, Aug 05 '25 07:08