OpenTAD
AdaTAD works worse than expected on IKEA ASM dataset
Hi, I have tried to train an AdaTAD model on the IKEA ASM dataset. I followed the THUMOS config with the VideoMAE base model.
The final epoch output is:
2024-07-15 09:17:18 Train INFO: [Train]: [059][00050/00126] Loss=0.5143 cls_loss=0.2856 reg_loss=0.2287 lr_backbone=3.9e-05 lr_det=3.9e-05 mem=4993MB
2024-07-15 09:22:19 Train INFO: [Train]: [059][00100/00126] Loss=0.5090 cls_loss=0.2780 reg_loss=0.2310 lr_backbone=3.8e-05 lr_det=3.8e-05 mem=4993MB
2024-07-15 09:25:03 Train INFO: [Train]: [059][00126/00126] Loss=0.5022 cls_loss=0.2728 reg_loss=0.2294 lr_backbone=3.8e-05 lr_det=3.8e-05 mem=4993MB
The evaluation result is:
2024-07-15 09:32:55 Train INFO: Evaluation starts...
2024-07-15 09:32:57 Train INFO: Loaded annotations from validation subset.
2024-07-15 09:32:57 Train INFO: Number of ground truth instances: 0
2024-07-15 09:32:57 Train INFO: Number of predictions: 234000
2024-07-15 09:32:57 Train INFO: Fixed threshold for tiou score: [0.3, 0.4, 0.5, 0.6, 0.7]
2024-07-15 09:32:57 Train INFO: Average-mAP: nan (%)
2024-07-15 09:32:57 Train INFO: mAP at tIoU 0.30 is nan%
2024-07-15 09:32:57 Train INFO: mAP at tIoU 0.40 is nan%
2024-07-15 09:32:57 Train INFO: mAP at tIoU 0.50 is nan%
2024-07-15 09:32:57 Train INFO: mAP at tIoU 0.60 is nan%
2024-07-15 09:32:57 Train INFO: mAP at tIoU 0.70 is nan%
2024-07-15 09:32:57 Train INFO: Training Over...
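The log above reports 0 ground truth instances for the validation subset, which would explain the nan mAP regardless of prediction quality. A minimal sketch of how the per-subset instance count could be checked is shown here. It assumes a THUMOS/ActivityNet-style annotation layout (video name mapped to a dict with `subset` and `annotations`); the function name, the toy data, and the labels are hypothetical and not taken from OpenTAD or IKEA ASM.

```python
from collections import Counter


def count_instances_by_subset(database):
    """Count annotated segments per subset name.

    `database` is assumed to map video names to dicts holding a
    "subset" string and an "annotations" list (assumed layout).
    """
    counts = Counter()
    for video in database.values():
        subset = video.get("subset", "<missing>")
        counts[subset] += len(video.get("annotations", []))
    return counts


# Toy data standing in for the real annotation file (placeholder labels).
database = {
    "video_a": {
        "subset": "training",
        "annotations": [{"segment": [0.0, 2.5], "label": "action_a"}],
    },
    "video_b": {
        "subset": "testing",
        "annotations": [{"segment": [1.0, 4.0], "label": "action_b"}],
    },
}

counts = count_instances_by_subset(database)
print(dict(counts))
# A subset name that never occurs in the file counts as 0 instances.
print("validation instances:", counts["validation"])
```

If the subset name used at evaluation time does not appear in the annotation file, the count for it is 0, which matches the symptom in the log.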
To inspect the predictions, I ran the model on a test video and drew the actions as bars at the bottom of the frame (the first bar is the ground truth, the second bar is the prediction). The snapshot below shows that most of the predicted actions are wrong.
When processing the dataset, I removed the 'NA' label and did no other processing. Any ideas on how to improve this?
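For context, a minimal sketch of what removing the 'NA' label might look like is given here. It is not the actual preprocessing script; the annotation layout (a list of dicts with `segment` and `label`), the function name, and the placeholder labels are assumptions for illustration.

```python
def drop_na_segments(annotations, na_label="NA"):
    """Return the annotations whose label is not the 'NA' placeholder."""
    return [ann for ann in annotations if ann["label"] != na_label]


# Toy annotations for one video (placeholder labels, assumed layout).
annotations = [
    {"segment": [0.0, 1.2], "label": "NA"},
    {"segment": [1.2, 4.0], "label": "action_a"},
    {"segment": [4.0, 4.5], "label": "NA"},
    {"segment": [4.5, 9.0], "label": "action_b"},
]

kept = drop_na_segments(annotations)
print([ann["label"] for ann in kept])
```

Only the segments labeled 'NA' are dropped; the remaining segments keep their original start and end times.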