KeyError: 'gqa_accuracy_answer_total_unscaled'
This error is really strange... I followed the README for training MDETR on CLEVR. First, I ran the following command:
python run_with_submitit.py --dataset_config configs/clevr_pretrain.json --backbone "resnet18" --num_queries 25 --batch_size 64 --schedule linear_with_warmup --text_encoder_type distilroberta-base --output-dir step1 --epochs 5 --lr_drop 20 --nodes 1 --ngpus 1
The only differences from the command in the README are that I used run_with_submitit.py and added the --nodes 1 --ngpus 1 parameters.
The training went well and the job finished successfully. Then I ran:
python run_with_submitit.py --dataset_config configs/clevr.json --backbone "resnet18" --num_queries 25 --batch_size 64 --schedule linear_with_warmup --text_encoder_type distilroberta-base --output-dir step2 --load ~/MDETR/mdetr/checkpoint/pchelintsev/experiments/19906/BEST_checkpoint.pth --epochs 5 --lr_drop 20 --nodes 1 --ngpus 1
After the first epoch and the evaluation, I got the following in the 28574_0_log.err file (warnings removed):
submitit ERROR (2021-09-27 13:01:24,999) - Submitted job triggered an exception
Traceback (most recent call last):
File "/home/pchelintsev/anaconda3/envs/mdetr_env/lib/python3.8/runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/home/pchelintsev/anaconda3/envs/mdetr_env/lib/python3.8/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/home/pchelintsev/anaconda3/envs/mdetr_env/lib/python3.8/site-packages/submitit/core/_submit.py", line 11, in <module>
submitit_main()
File "/home/pchelintsev/anaconda3/envs/mdetr_env/lib/python3.8/site-packages/submitit/core/submission.py", line 71, in submitit_main
process_job(args.folder)
File "/home/pchelintsev/anaconda3/envs/mdetr_env/lib/python3.8/site-packages/submitit/core/submission.py", line 64, in process_job
raise error
File "/home/pchelintsev/anaconda3/envs/mdetr_env/lib/python3.8/site-packages/submitit/core/submission.py", line 53, in process_job
result = delayed.result()
File "/home/pchelintsev/anaconda3/envs/mdetr_env/lib/python3.8/site-packages/submitit/core/utils.py", line 128, in result
self._result = self.function(*self.args, **self.kwargs)
File "run_with_submitit.py", line 98, in __call__
detection.main(self.args)
File "/home/pchelintsev/MDETR/mdetr/main.py", line 614, in main
metric = test_stats["gqa_accuracy_answer_total_unscaled"]
KeyError: 'gqa_accuracy_answer_total_unscaled'
Why is this metric missing?
Also, here is the end of the 28574_0_log.out file:
Accumulating evaluation results...
DONE (t=70.57s).
IoU metric: bbox
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.581
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.893
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.660
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.374
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.578
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.768
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.302
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.729
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.741
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.637
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.741
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.842
submitit ERROR (2021-09-27 13:01:24,999) - Submitted job triggered an exception
I noticed something strange in main.py. It may not be the root cause, but it might help in tracking down the bug. The line in question is the one that updates test_stats: in the case of CLEVR, the update produces only 'clevr_*' keys (because the config file lists only ["clevr"]), so the key
gqa_accuracy_answer_total_unscaled
can never appear...
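To make the mismatch concrete, here is a minimal self-contained demo (dataset_name, curr_test_stats, and the metric value are illustrative, not the verbatim upstream code; only the hardcoded lookup on line 614 is confirmed by the traceback above):

dataset_name = "clevr"  # from the config file's ["clevr"] entry
curr_test_stats = {"accuracy_answer_total_unscaled": 0.97}  # illustrative value

test_stats = {}
# main.py prefixes every evaluation metric key with the dataset name...
test_stats.update({f"{dataset_name}_{k}": v for k, v in curr_test_stats.items()})

# ...but the QA metric lookup hardcodes the "gqa_" prefix:
metric = test_stats["gqa_accuracy_answer_total_unscaled"]  # raises KeyError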
@TopCoder2K Did you manage to find a solution to this?
Not sure why this is hardcoded. When doing QA on the CLEVR dataset, the available keys (metrics) are listed below. Changing 'gqa_accuracy_answer_total_unscaled' to 'clevr_accuracy_answer_total_unscaled' in the code should fix the problem (a sketch of the change follows the key list).
dict_keys([
'clevr_loss',
'clevr_loss_ce',
'clevr_loss_bbox',
'clevr_loss_giou',
'clevr_loss_contrastive_align',
'clevr_loss_ce_0',
'clevr_loss_bbox_0',
'clevr_loss_giou_0',
'clevr_loss_contrastive_align_0',
'clevr_loss_ce_1',
'clevr_loss_bbox_1',
'clevr_loss_giou_1',
'clevr_loss_contrastive_align_1',
'clevr_loss_ce_2',
'clevr_loss_bbox_2',
'clevr_loss_giou_2',
'clevr_loss_contrastive_align_2',
'clevr_loss_ce_3',
'clevr_loss_bbox_3',
'clevr_loss_giou_3',
'clevr_loss_contrastive_align_3',
'clevr_loss_ce_4',
'clevr_loss_bbox_4',
'clevr_loss_giou_4',
'clevr_loss_contrastive_align_4',
'clevr_loss_answer_type',
'clevr_loss_answer_binary',
'clevr_loss_answer_reg',
'clevr_loss_answer_attr',
'clevr_loss_ce_unscaled',
'clevr_loss_bbox_unscaled',
'clevr_loss_giou_unscaled',
'clevr_cardinality_error_unscaled',
'clevr_loss_contrastive_align_unscaled',
'clevr_loss_ce_0_unscaled',
'clevr_loss_bbox_0_unscaled',
'clevr_loss_giou_0_unscaled',
'clevr_cardinality_error_0_unscaled',
'clevr_loss_contrastive_align_0_unscaled',
'clevr_loss_ce_1_unscaled',
'clevr_loss_bbox_1_unscaled',
'clevr_loss_giou_1_unscaled',
'clevr_cardinality_error_1_unscaled',
'clevr_loss_contrastive_align_1_unscaled',
'clevr_loss_ce_2_unscaled',
'clevr_loss_bbox_2_unscaled',
'clevr_loss_giou_2_unscaled',
'clevr_cardinality_error_2_unscaled',
'clevr_loss_contrastive_align_2_unscaled',
'clevr_loss_ce_3_unscaled',
'clevr_loss_bbox_3_unscaled',
'clevr_loss_giou_3_unscaled',
'clevr_cardinality_error_3_unscaled',
'clevr_loss_contrastive_align_3_unscaled',
'clevr_loss_ce_4_unscaled',
'clevr_loss_bbox_4_unscaled',
'clevr_loss_giou_4_unscaled',
'clevr_cardinality_error_4_unscaled',
'clevr_loss_contrastive_align_4_unscaled',
'clevr_loss_answer_type_unscaled',
'clevr_accuracy_answer_type_unscaled',
'clevr_loss_answer_binary_unscaled',
'clevr_accuracy_answer_binary_unscaled',
'clevr_loss_answer_reg_unscaled',
'clevr_accuracy_answer_reg_unscaled',
'clevr_loss_answer_attr_unscaled',
'clevr_accuracy_answer_attr_unscaled',
'clevr_accuracy_answer_total_unscaled',
'clevr_coco_eval_bbox'
])
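For concreteness, a minimal sketch of the suggested change at main.py line 614 (the dataset-agnostic variant is my own suggestion, and test_stats is stubbed with an illustrative value here):

# Illustrative stand-in for the dict whose keys are listed above:
test_stats = {"clevr_accuracy_answer_total_unscaled": 0.97}

# One-line fix for CLEVR: look up the key that actually exists.
metric = test_stats["clevr_accuracy_answer_total_unscaled"]

# Dataset-agnostic alternative: pick whichever
# "*_accuracy_answer_total_unscaled" key the evaluation produced,
# instead of hardcoding a dataset prefix.
qa_keys = [k for k in test_stats if k.endswith("_accuracy_answer_total_unscaled")]
assert len(qa_keys) == 1, f"expected exactly one QA accuracy key, got {qa_keys}"
metric = test_stats[qa_keys[0]]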