
TypeError: CocoVideoDataset: list indices must be integers or slices, not str

Open Youngforever0911 opened this issue 2 years ago • 8 comments

/home/nhl510wm/xxy/code/mmtracking/mmtrack/core/utils/ UserWarning: Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
  warnings.warn(
/home/nhl510wm/xxy/code/mmtracking/mmtrack/core/utils/ UserWarning: Setting MKL_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
  warnings.warn(
2022-10-25 22:30:34,450 - mmtrack - INFO - Environment info:

sys.platform: linux
Python: 3.9.13 (main, Oct 13 2022, 21:15:33) [GCC 11.2.0]
CUDA available: True
GPU 0: NVIDIA GeForce RTX 3090
CUDA_HOME: None
GCC: gcc (Ubuntu 11.2.0-19ubuntu1) 11.2.0
PyTorch: 1.10.0+cu111
PyTorch compiling details: PyTorch built with:

  • GCC 7.3
  • C++ Version: 201402
  • Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications
  • Intel(R) MKL-DNN v2.2.3 (Git Hash 7336ca9f055cf1bfa13efb658fe15dc9b41f0740)
  • OpenMP 201511 (a.k.a. OpenMP 4.5)
  • LAPACK is enabled (usually provided by MKL)
  • NNPACK is enabled
  • CPU capability usage: AVX2
  • CUDA Runtime 11.1
  • NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86
  • CuDNN 8.0.5
  • Magma 2.5.2
  • Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.1, CUDNN_VERSION=8.0.5, CXX_COMPILER=/opt/rh/devtoolset-7/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.10.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON,

TorchVision: 0.11.0+cu111
OpenCV: 4.6.0
MMCV: 1.5.3
MMCV Compiler: GCC 7.3
MMCV CUDA Compiler: 11.1
MMTracking: 0.14.0+2c7a8af

2022-10-25 22:30:34,451 - mmtrack - INFO - Distributed training: False 2022-10-25 22:30:35,631 - mmtrack - INFO - Config: model = dict( type='DFF', detector=dict( type='FasterRCNN', backbone=dict( type='ResNet', depth=50, num_stages=4, out_indices=(3, ), strides=(1, 2, 2, 1), dilations=(1, 1, 1, 2), frozen_stages=1, norm_cfg=dict(type='BN', requires_grad=True), norm_eval=True, style='pytorch', init_cfg=dict( type='Pretrained', checkpoint='torchvision://resnet50')), neck=dict( type='ChannelMapper', in_channels=[2048], out_channels=512, kernel_size=3), rpn_head=dict( type='RPNHead', in_channels=512, feat_channels=512, anchor_generator=dict( type='AnchorGenerator', scales=[4, 8, 16, 32], ratios=[0.5, 1.0, 2.0], strides=[16]), bbox_coder=dict( type='DeltaXYWHBBoxCoder', target_means=[0.0, 0.0, 0.0, 0.0], target_stds=[1.0, 1.0, 1.0, 1.0]), loss_cls=dict( type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0), loss_bbox=dict( type='SmoothL1Loss', beta=0.1111111111111111, loss_weight=1.0)), roi_head=dict( type='StandardRoIHead', bbox_roi_extractor=dict( type='SingleRoIExtractor', roi_layer=dict( type='RoIAlign', output_size=7, sampling_ratio=2), out_channels=512, featmap_strides=[16]), bbox_head=dict( type='Shared2FCBBoxHead', in_channels=512, fc_out_channels=1024, roi_feat_size=7, num_classes=30, bbox_coder=dict( type='DeltaXYWHBBoxCoder', target_means=[0.0, 0.0, 0.0, 0.0], target_stds=[0.2, 0.2, 0.2, 0.2]), reg_class_agnostic=False, loss_cls=dict( type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0), loss_bbox=dict(type='SmoothL1Loss', beta=1.0, loss_weight=1.0))), train_cfg=dict( rpn=dict( assigner=dict( type='MaxIoUAssigner', pos_iou_thr=0.7, neg_iou_thr=0.3, min_pos_iou=0.3, ignore_iof_thr=-1), sampler=dict( type='RandomSampler', num=256, pos_fraction=0.5, neg_pos_ub=-1, add_gt_as_proposals=False), allowed_border=0, pos_weight=-1, debug=False), rpn_proposal=dict( nms_across_levels=False, nms_pre=6000, nms_post=1000, max_num=1000, nms_thr=0.7, 
min_bbox_size=0), rcnn=dict( assigner=dict( type='MaxIoUAssigner', pos_iou_thr=0.5, neg_iou_thr=0.5, min_pos_iou=0.5, ignore_iof_thr=-1), sampler=dict( type='RandomSampler', num=512, pos_fraction=0.25, neg_pos_ub=-1, add_gt_as_proposals=True), pos_weight=-1, debug=False)), test_cfg=dict( rpn=dict( nms_across_levels=False, nms_pre=6000, nms_post=300, max_num=300, nms_thr=0.7, min_bbox_size=0), rcnn=dict( score_thr=0.0001, nms=dict(type='nms', iou_threshold=0.5), max_per_img=100))), motion=dict( type='FlowNetSimple', img_scale_factor=0.5, init_cfg=dict( type='Pretrained', checkpoint= '' )), train_cfg=None, test_cfg=dict(key_frame_interval=10)) dataset_type = 'CocoVideoDataset' classes = ('ps', 'head') data_root = '/home/nhl510wm/xxy/data/detect/' img_norm_cfg = dict( mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True) train_pipeline = [ dict(type='LoadMultiImagesFromFile'), dict(type='SeqLoadAnnotations', with_bbox=True, with_track=True), dict(type='SeqResize', img_scale=(1000, 600), keep_ratio=True), dict(type='SeqRandomFlip', share_params=True, flip_ratio=0.5), dict( type='SeqNormalize', mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True), dict(type='SeqPad', size_divisor=16), dict( type='VideoCollect', keys=['img', 'gt_bboxes', 'gt_labels', 'gt_instance_ids']), dict(type='ConcatVideoReferences'), dict(type='SeqDefaultFormatBundle', ref_prefix='ref') ] test_pipeline = [ dict(type='LoadImageFromFile'), dict( type='MultiScaleFlipAug', img_scale=(1000, 600), flip=False, transforms=[ dict(type='Resize', keep_ratio=True), dict(type='RandomFlip'), dict( type='Normalize', mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True), dict(type='Pad', size_divisor=16), dict(type='ImageToTensor', keys=['img']), dict(type='VideoCollect', keys=['img']) ]) ] data = dict( samples_per_gpu=1, workers_per_gpu=2, train=[ dict( type='CocoVideoDataset', ann_file= '/home/nhl510wm/xxy/data/detect/mmtracking 
data/instances_jnu_3.json', img_prefix='/home/nhl510wm/xxy/data/detect/image_all', classes=('ps', 'head'), ref_img_sampler=dict( num_ref_imgs=1, frame_range=9, filter_key_img=False, method='uniform'), pipeline=[ dict(type='LoadMultiImagesFromFile'), dict( type='SeqLoadAnnotations', with_bbox=True, with_track=True), dict(type='SeqResize', img_scale=(1000, 600), keep_ratio=True), dict(type='SeqRandomFlip', share_params=True, flip_ratio=0.5), dict( type='SeqNormalize', mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True), dict(type='SeqPad', size_divisor=16), dict( type='VideoCollect', keys=['img', 'gt_bboxes', 'gt_labels', 'gt_instance_ids']), dict(type='ConcatVideoReferences'), dict(type='SeqDefaultFormatBundle', ref_prefix='ref') ]), dict( type='CocoVideoDataset', load_as_video=False, ann_file= '/home/nhl510wm/xxy/data/detect/mmtracking data/instances_jnu_3.json', img_prefix='/home/nhl510wm/xxy/data/detect/image_all', classes=('ps', 'head'), ref_img_sampler=dict( num_ref_imgs=1, frame_range=0, filter_key_img=False, method='uniform'), pipeline=[ dict(type='LoadMultiImagesFromFile'), dict( type='SeqLoadAnnotations', with_bbox=True, with_track=True), dict(type='SeqResize', img_scale=(1000, 600), keep_ratio=True), dict(type='SeqRandomFlip', share_params=True, flip_ratio=0.5), dict( type='SeqNormalize', mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True), dict(type='SeqPad', size_divisor=16), dict( type='VideoCollect', keys=['img', 'gt_bboxes', 'gt_labels', 'gt_instance_ids']), dict(type='ConcatVideoReferences'), dict(type='SeqDefaultFormatBundle', ref_prefix='ref') ]) ], val=dict( type='CocoVideoDataset', ann_file= '/home/nhl510wm/xxy/data/detect/mmtracking data/instances_jnu_3.json', img_prefix='/home/nhl510wm/xxy/data/detect/image_all', classes=('ps', 'head'), ref_img_sampler=None, pipeline=[ dict(type='LoadImageFromFile'), dict( type='MultiScaleFlipAug', img_scale=(1000, 600), flip=False, transforms=[ dict(type='Resize', 
keep_ratio=True), dict(type='RandomFlip'), dict( type='Normalize', mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True), dict(type='Pad', size_divisor=16), dict(type='ImageToTensor', keys=['img']), dict(type='VideoCollect', keys=['img']) ]) ], test_mode=True), test=dict( type='CocoVideoDataset', ann_file= '/home/nhl510wm/xxy/data/detect/mmtracking data/instances_jnu_3.json', img_prefix='/home/nhl510wm/xxy/data/detect/image_all', classes=('ps', 'head'), ref_img_sampler=None, pipeline=[ dict(type='LoadImageFromFile'), dict( type='MultiScaleFlipAug', img_scale=(1000, 600), flip=False, transforms=[ dict(type='Resize', keep_ratio=True), dict(type='RandomFlip'), dict( type='Normalize', mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True), dict(type='Pad', size_divisor=16), dict(type='ImageToTensor', keys=['img']), dict(type='VideoCollect', keys=['img']) ]) ], test_mode=True)) optimizer = dict(type='SGD', lr=0.02, momentum=0.9, weight_decay=0.0001) optimizer_config = dict(grad_clip=None) checkpoint_config = dict(interval=1) log_config = dict(interval=50, hooks=[dict(type='TextLoggerHook')]) dist_params = dict(backend='nccl') log_level = 'INFO' load_from = None resume_from = None workflow = [('train', 1)] opencv_num_threads = 0 mp_start_method = 'fork' work_dir = '/home/nhl510wm/xxy/code/work_dirs_tra/MaskTrackRCNN' gpu_ids = [0]

2022-10-25 22:30:35,631 - mmtrack - INFO - Set random seed to 1006476471, deterministic: False
2022-10-25 22:30:36,689 - mmtrack - INFO - initialize ResNet with init_cfg {'type': 'Pretrained', 'checkpoint': 'torchvision://resnet50'}
2022-10-25 22:30:36,689 - mmcv - INFO - load model from: torchvision://resnet50
2022-10-25 22:30:36,689 - mmcv - INFO - load checkpoint from torchvision path: torchvision://resnet50
2022-10-25 22:30:36,783 - mmcv - WARNING - The model and loaded state dict do not match exactly

unexpected key in source state_dict: fc.weight, fc.bias

2022-10-25 22:30:36,807 - mmtrack - INFO - initialize ChannelMapper with init_cfg {'type': 'Xavier', 'layer': 'Conv2d', 'distribution': 'uniform'}
2022-10-25 22:30:36,867 - mmtrack - INFO - initialize RPNHead with init_cfg {'type': 'Normal', 'layer': 'Conv2d', 'std': 0.01}
2022-10-25 22:30:36,883 - mmtrack - INFO - initialize Shared2FCBBoxHead with init_cfg [{'type': 'Normal', 'std': 0.01, 'override': {'name': 'fc_cls'}}, {'type': 'Normal', 'std': 0.001, 'override': {'name': 'fc_reg'}}, {'type': 'Xavier', 'distribution': 'uniform', 'override': [{'name': 'shared_fcs'}, {'name': 'cls_fcs'}, {'name': 'reg_fcs'}]}]
2022-10-25 22:30:37,136 - mmtrack - INFO - initialize FlowNetSimple with init_cfg {'type': 'Pretrained', 'checkpoint': ''}
2022-10-25 22:30:37,136 - mmcv - INFO - load model from:
2022-10-25 22:30:37,136 - mmcv - INFO - load checkpoint from http path:
loading annotations into memory...
Done (t=0.06s)
creating index...
Traceback (most recent call last):
  File "/home/nhl510wm/anaconda3/envs/xxy_mmd/lib/python3.9/site-packages/mmcv/utils/", line 69, in build_from_cfg
    return obj_cls(**args)
  File "/home/nhl510wm/xxy/code/mmtracking/mmtrack/datasets/", line 46, in __init__
    super().__init__(*args, **kwargs)
  File "/home/nhl510wm/xxy/code/mmdetection/mmdet/datasets/", line 95, in __init__
    self.data_infos = self.load_annotations(local_path)
  File "/home/nhl510wm/xxy/code/mmtracking/mmtrack/datasets/", line 61, in load_annotations
    data_infos = self.load_video_anns(ann_file)
  File "/home/nhl510wm/xxy/code/mmtracking/mmtrack/datasets/", line 81, in load_video_anns
    img_ids = self.coco.get_img_ids_from_vid(vid_id)
  File "/home/nhl510wm/xxy/code/mmtracking/mmtrack/datasets/parsers/", line 124, in get_img_ids_from_vid
    ids[img_info['frame_id']] = img_info['id']
TypeError: list indices must be integers or slices, not str

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/nhl510wm/xxy/code/mmtracking/tools/", line 217, in <module>
    main()
  File "/home/nhl510wm/xxy/code/mmtracking/tools/", line 192, in main
    datasets = [build_dataset(]
  File "/home/nhl510wm/xxy/code/mmdetection/mmdet/datasets/", line 63, in build_dataset
    dataset = ConcatDataset([build_dataset(c, default_args) for c in cfg])
  File "/home/nhl510wm/xxy/code/mmdetection/mmdet/datasets/", line 63, in <listcomp>
    dataset = ConcatDataset([build_dataset(c, default_args) for c in cfg])
  File "/home/nhl510wm/xxy/code/mmdetection/mmdet/datasets/", line 82, in build_dataset
    dataset = build_from_cfg(cfg, DATASETS, default_args)
  File "/home/nhl510wm/anaconda3/envs/xxy_mmd/lib/python3.9/site-packages/mmcv/utils/", line 72, in build_from_cfg
    raise type(e)(f'{}: {e}')
TypeError: CocoVideoDataset: list indices must be integers or slices, not str
index created!

Process finished with exit code 1

Youngforever0911, Oct 25 '22 14:10

{
  "images": [
    { "height": 1026, "width": 1295, "id": 1, "file_name": "20190830T115515_109.png", "frame_id": "109", "video_id": 0 },
  "type": "instances",
  "annotations": [
    { "segmentation": [ [ 0, 0, 0, 0, 0, 0, 0, 0 ] ], "area": 1, "iscrowd": "false", "ignore": "false", "image_id": 1, "bbox": [ 0, 0, 0, 0 ], "category_id": 1, "id": 1, "video_id": 0, "instance_id": "false", "occluded": "false", "truncated": "false", "is_vid_train_frame": "true", "visibility": 1.0 },
    { "segmentation": [ [ 530, 357, 530, 864, 1074, 864, 1074, 357 ] ], "area": 276860, "iscrowd": "false", "ignore": "false", "image_id": 1, "bbox": [ 530, 357, 544, 507 ], "category_id": 2, "id": 2, "video_id": 0, "instance_id": "false", "occluded": "false", "truncated": "false", "is_vid_train_frame": "true", "visibility": 1.0 },

Youngforever0911, Oct 25 '22 14:10
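
The statement that fails in the traceback, `ids[img_info['frame_id']] = img_info['id']`, uses `frame_id` as a list index, while the annotation excerpt stores it as the string "109". The following is a minimal sketch, independent of mmtracking, of why that combination raises the reported error. The `ids` list, the `assign` helper and the list length of 200 are hypothetical stand-ins for the parser's internal state, not mmtracking code.

```python
# Hypothetical stand-ins for the parser state; this is not mmtracking code.
img_info = {"id": 1, "frame_id": "109", "video_id": 0}  # values from the excerpt
ids = [0] * 200  # assumed: a list with one slot per frame position


def assign(ids, img_info):
    """Mirror the failing statement shown in the traceback."""
    ids[img_info["frame_id"]] = img_info["id"]


try:
    assign(ids, img_info)  # frame_id is a str, so list indexing fails
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)  # list indices must be integers or slices, not str

# The same record with an integer frame_id is accepted.
assign(ids, {**img_info, "frame_id": int(img_info["frame_id"])})
```

With the integer value the assignment succeeds, which suggests the type of `frame_id` in the JSON file is what triggers the exception.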

I don't know whether this is a data problem or a code problem.

Youngforever0911, Oct 25 '22 14:10
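
The traceback and the annotation excerpt both appear to point at the data: `frame_id` is a quoted string, and the failing statement uses it to index a list. The sketch below rewrites `frame_id` as a 0-based integer within each video. It rests on two assumptions that are not confirmed in this thread: that the list being indexed has one slot per image of the video (so ids must be consecutive and start at 0), and that sorting by the numeric value of the existing `frame_id` preserves temporal order. The function name `normalize_frame_ids` is hypothetical.

```python
from collections import defaultdict


def normalize_frame_ids(coco):
    """Rewrite each image's frame_id as a 0-based int index within its video.

    Assumes the numeric value of the existing frame_id reflects frame order.
    """
    by_video = defaultdict(list)
    for img in coco["images"]:
        by_video[img["video_id"]].append(img)

    for imgs in by_video.values():
        # Order frames by their current (possibly string) frame_id.
        imgs.sort(key=lambda im: int(im["frame_id"]))
        for new_idx, img in enumerate(imgs):
            img["frame_id"] = new_idx
    return coco


# Tiny illustrative input; only the fields the function touches are included.
sample = {
    "images": [
        {"id": 2, "video_id": 0, "frame_id": "110"},
        {"id": 1, "video_id": 0, "frame_id": "109"},
        {"id": 3, "video_id": 1, "frame_id": "7"},
    ]
}
fixed = normalize_frame_ids(sample)
frame_ids = {img["id"]: img["frame_id"] for img in fixed["images"]}
print(frame_ids)
```

For a real file, the same function could be applied between `json.load` and `json.dump`, writing to a new path so the original annotations are kept.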

Hi, I have the same problem. Did you solve it?

YOOHYOJEONG, Nov 01 '22 01:11

Hi, I have the same problem. Did you solve it?

Not yet. I remember that there were similar problems in past issues. I referred to them but still haven't solved it. You can try looking at them.

Youngforever0911, Nov 01 '22 01:11

Can you tell me the issue numbers of the "similar problems in the past issues"?

YOOHYOJEONG, Nov 01 '22 02:11

Hello, have you solved the problem? By the way, could you please tell me how to make my own CocoVideoDataset? I can pay you. @Youngforever0911

lijoe123, Dec 19 '22 02:12
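
On building a custom annotation file, here is a rough sketch of assembling a COCO-style dictionary with per-video, 0-based integer `frame_id` values. The field names `video_id`, `frame_id`, `instance_id`, `image_id`, `category_id` and `bbox` come from the excerpt in this thread; the top-level `videos` and `categories` lists, the integer `instance_id` and numeric `iscrowd` are assumptions and should be checked against the mmtracking dataset documentation. The function name, the input layout and the file names are hypothetical.

```python
import json


def build_video_annotations(videos, class_names):
    """Assemble a COCO-style dict with 0-based integer frame ids per video.

    `videos` maps a video name to its frames in temporal order. Each frame is
    a dict with file_name, width, height and `objects`, a list of
    (class_name, instance_id, [x, y, w, h]) tuples.
    """
    cat_ids = {name: i + 1 for i, name in enumerate(class_names)}
    out = {
        "categories": [{"id": cid, "name": n} for n, cid in cat_ids.items()],
        "videos": [],        # assumed top-level key
        "images": [],
        "annotations": [],
    }
    img_id = ann_id = 0
    for vid_id, (vid_name, frames) in enumerate(videos.items()):
        out["videos"].append({"id": vid_id, "name": vid_name})
        for frame_id, frame in enumerate(frames):  # int, starts at 0
            img_id += 1
            out["images"].append({
                "id": img_id, "video_id": vid_id, "frame_id": frame_id,
                "file_name": frame["file_name"],
                "width": frame["width"], "height": frame["height"],
            })
            for class_name, instance_id, bbox in frame["objects"]:
                ann_id += 1
                x, y, w, h = bbox
                out["annotations"].append({
                    "id": ann_id, "image_id": img_id, "video_id": vid_id,
                    "category_id": cat_ids[class_name],
                    "instance_id": instance_id,  # assumed: integer track id
                    "bbox": [x, y, w, h], "area": w * h, "iscrowd": 0,
                })
    return out


# Illustrative input: one two-frame clip using the classes from the config.
demo = build_video_annotations(
    {"clip_a": [
        {"file_name": "clip_a/000.png", "width": 1295, "height": 1026,
         "objects": [("head", 1, [530, 357, 544, 507])]},
        {"file_name": "clip_a/001.png", "width": 1295, "height": 1026,
         "objects": []},
    ]},
    class_names=("ps", "head"),
)
print(json.dumps(demo["images"], indent=2))
```

Generating `frame_id` with `enumerate` keeps the values integer and consecutive by construction, which avoids the string index seen in the traceback.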

Hello, have you solved the problem? By the way, could you please tell me how to make my own CocoVideoDataset? I can pay you. @Youngforever0911

Have you got it? I have the same difficulty too.

meikorol, Apr 04 '23 03:04

I'm having this same issue... I wonder if anyone knows why?

Edwin-Sanchez2003, Nov 08 '23 04:11