
[Bug] Missing 'info' field in cat_dataset.zip causing evaluation error

Open codingbus821 opened this issue 5 months ago • 2 comments

Checklist

  • [x] I have searched related issues but cannot get the expected help.
  • [x] I have read the [FAQ documentation](https://mmdetection.readthedocs.io/en/latest/faq.html) but cannot get the expected help.
  • [x] The bug has not been fixed in the latest version.

Describe the bug

The cat_dataset.zip provided in the GroundingDINO tutorial is missing the 'info' field in its COCO-format annotation files (test.json and trainval.json). pycocotools' COCO.loadRes deep-copies this field unconditionally, so evaluation fails with a KeyError.
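For context, the failure happens inside pycocotools' COCO.loadRes, which deep-copies self.dataset['info'] without checking that the key exists. A minimal stdlib-only sketch of that code path (the annotation dict below is a hypothetical stand-in for trainval.json, not the actual file contents):

```python
import copy

# Hypothetical COCO-style annotation dict, shaped like trainval.json.
# Note: no 'info' key, matching the files shipped in cat_dataset.zip.
dataset = {
    "images": [{"id": 1, "file_name": "cat_1.jpg", "height": 480, "width": 640}],
    "annotations": [{"id": 1, "image_id": 1, "category_id": 1,
                     "bbox": [10, 20, 100, 120], "area": 12000, "iscrowd": 0}],
    "categories": [{"id": 1, "name": "cat"}],
}

# This mirrors the failing line in pycocotools/coco.py (loadRes):
#   res.dataset['info'] = copy.deepcopy(self.dataset['info'])
try:
    info = copy.deepcopy(dataset["info"])
except KeyError as exc:
    print(f"KeyError: {exc}")  # prints: KeyError: 'info'
```

Any annotation file that omits 'info' will trip this line the moment evaluation calls loadRes, regardless of how valid the rest of the file is.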

Reproduction

  1. What command or script did you run?
# Download dataset as instructed in README
cd mmdetection
wget https://download.openmmlab.com/mmyolo/data/cat_dataset.zip
unzip cat_dataset.zip -d data/cat/

# Run training
bash tools/dist_train.sh configs/grounding_dino/grounding_dino_swin-t_finetune_8xb2_20e_cat.py 1

# Or run testing
python tools/test.py configs/grounding_dino/grounding_dino_swin-t_finetune_8xb2_20e_cat.py https://download.openmmlab.com/mmdetection/v3.0/grounding_dino/groundingdino_swint_ogc_mmdet-822d7e9d.pth
  2. Did you make any modifications on the code or config? Did you understand what you have modified? No modifications were made. Used the original config file and followed the exact instructions from the README.

  3. What dataset did you use? Cat dataset from https://download.openmmlab.com/mmyolo/data/cat_dataset.zip as mentioned in configs/grounding_dino/README.md

Environment

  1. Please run python mmdet/utils/collect_env.py to collect necessary environment information and paste it here.
sys.platform: linux
Python: 3.10.18 | packaged by conda-forge | (main, Jun  4 2025, 14:45:41) [GCC 13.3.0]
CUDA available: True
MUSA available: False
GPU 0,1: NVIDIA GeForce RTX 4090
CUDA_HOME: /usr/local/cuda
NVCC: Cuda compilation tools, release 11.8, V11.8.89
GCC: gcc (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0
PyTorch: 2.7.1+cu118
TorchVision: 0.22.1+cu118
OpenCV: 4.12.0
MMEngine: 0.10.7
MMDetection: 3.3.0+
  2. PyTorch installed via conda, no special environment variables set.

Error traceback

07/28 10:55:36 - mmengine - INFO - Evaluating bbox...
[rank0]: Traceback (most recent call last):
[rank0]:   File "/home/user/Desktop/Dron_Detection/mmdetection/./tools/train.py", line 121, in <module>
[rank0]:     main()
[rank0]:   File "/home/user/Desktop/Dron_Detection/mmdetection/./tools/train.py", line 117, in main
[rank0]:     runner.train()
[rank0]:   File "/home/user/anaconda3/envs/mm/lib/python3.10/site-packages/mmengine/runner/runner.py", line 1777, in train
[rank0]:     model = self.train_loop.run()  # type: ignore
[rank0]:   File "/home/user/anaconda3/envs/mm/lib/python3.10/site-packages/mmengine/runner/loops.py", line 105, in run
[rank0]:     self.runner.val_loop.run()
[rank0]:   File "/home/user/anaconda3/envs/mm/lib/python3.10/site-packages/mmengine/runner/loops.py", line 382, in run
[rank0]:     metrics = self.evaluator.evaluate(len(self.dataloader.dataset))
[rank0]:   File "/home/user/anaconda3/envs/mm/lib/python3.10/site-packages/mmengine/evaluator/evaluator.py", line 79, in evaluate
[rank0]:     _results = metric.evaluate(size)
[rank0]:   File "/home/user/anaconda3/envs/mm/lib/python3.10/site-packages/mmengine/evaluator/metric.py", line 133, in evaluate
[rank0]:     _metrics = self.compute_metrics(results)  # type: ignore
[rank0]:   File "/home/user/Desktop/Dron_Detection/mmdetection/mmdet/evaluation/metrics/coco_metric.py", line 462, in compute_metrics
[rank0]:     coco_dt = self._coco_api.loadRes(predictions)
[rank0]:   File "/home/user/anaconda3/envs/mm/lib/python3.10/site-packages/pycocotools/coco.py", line 314, in loadRes
[rank0]:     res.dataset['info'] = copy.deepcopy(self.dataset['info'])
[rank0]: KeyError: 'info'

Bug fix

I have identified the root cause: the annotation files in cat_dataset.zip are missing the 'info' field, which pycocotools' COCO.loadRes requires, since it deep-copies self.dataset['info'] unconditionally.

Temporary workaround: Add the missing 'info' field to both data/cat/annotations/test.json and data/cat/annotations/trainval.json:

import json

def fix_annotation(file_path):
    """Add a minimal COCO 'info' block to an annotation file if it is missing."""
    with open(file_path, 'r') as f:
        data = json.load(f)

    if 'info' not in data:
        data['info'] = {
            "description": "Cat Dataset",
            "version": "1.0",
            "year": 2025,
            "contributor": "",
            "date_created": "2025-07-28"
        }
        # Only rewrite the file when a change was actually made,
        # so re-running the script is a harmless no-op.
        with open(file_path, 'w') as f:
            json.dump(data, f, indent=2)

fix_annotation('data/cat/annotations/test.json')
fix_annotation('data/cat/annotations/trainval.json')
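To sanity-check the patch (and that re-running it is harmless), here is a small self-contained sketch that exercises the same fix_annotation logic against a throwaway copy rather than the real annotation files:

```python
import json
import os
import tempfile

def fix_annotation(file_path):
    """Same logic as the workaround above, repeated here for self-containment."""
    with open(file_path, 'r') as f:
        data = json.load(f)
    if 'info' not in data:
        data['info'] = {
            "description": "Cat Dataset",
            "version": "1.0",
            "year": 2025,
            "contributor": "",
            "date_created": "2025-07-28"
        }
        with open(file_path, 'w') as f:
            json.dump(data, f, indent=2)

# Build a throwaway annotation file missing 'info', like the shipped ones.
with tempfile.NamedTemporaryFile('w', suffix='.json', delete=False) as f:
    json.dump({"images": [], "annotations": [], "categories": []}, f)
    path = f.name

fix_annotation(path)  # first run adds the field
fix_annotation(path)  # second run is a no-op (idempotent)

with open(path) as f:
    fixed = json.load(f)
assert 'info' in fixed and fixed['info']['description'] == "Cat Dataset"
os.remove(path)
print("annotation file patched and verified")
```

After patching the real files the same way, pycocotools can deep-copy dataset['info'] and evaluation proceeds past the loadRes call.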

Proposed solution:

  1. Update the cat_dataset.zip to include proper COCO format with 'info' field
  2. Update the README.md to include troubleshooting information for this issue

I'm willing to create a PR to fix this if the maintainers would like. This would help future users avoid this common issue when following the tutorial.

codingbus821 avatar Jul 28 '25 02:07 codingbus821

Hello, have you encountered val metrics that are all 0.00 (or nearly 0.00) when fine-tuning GroundingDINO? I followed the official cat-dataset instructions too, but inference via image_demo.py seems to work fine.

connorye avatar Jul 31 '25 09:07 connorye

The performance when training with the cat dataset is as follows.

Average Precision (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100  ] = 0.893
Average Precision (AP) @[ IoU=0.50      | area=   all | maxDets=1000 ] = 1.000
Average Precision (AP) @[ IoU=0.75      | area=   all | maxDets=1000 ] = 0.913
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=1000 ] = -1.000
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=1000 ] = -1.000
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=1000 ] = 0.893
Average Recall    (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100  ] = 0.940
Average Recall    (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=300  ] = 0.943
Average Recall    (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=1000 ] = 0.943
Average Recall    (AR) @[ IoU=0.50:0.95 | area= small | maxDets=1000 ] = -1.000
Average Recall    (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=1000 ] = -1.000
Average Recall    (AR) @[ IoU=0.50:0.95 | area= large | maxDets=1000 ] = 0.943

codingbus821 avatar Sep 10 '25 01:09 codingbus821