
IndexError: list index out of range when training Faster RCNN

Open · sssdhjh opened this issue 2 years ago · 1 comment

Checklist:

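  1. Searched existing related issues for an answer
  2. Read through the FAQ summary and Q&A
  3. Confirmed that the bug is still unfixed in the latest version
  4. Read through the PaddleX API documentation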

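Problem description

When I tried training on my own annotated dataset in AI Studio as a test, model training failed with IndexError: list index out of range. Strangely, training ran normally the night before I filed this issue, and I did not change any environment configuration in between. Creating a new project to reproduce the environment still triggers the error.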

Reproduction

  1. Have you successfully run the tutorials we provide? Yes

  2. Did you modify the code on top of the tutorial? Please provide the code you ran.

!cd ~  # set the current directory
!unzip data/data132692/VOCdataset.zip -d work/dataset  # unzip the custom dataset
!pip install paddlex==2.1.0 -i https://mirror.baidu.com/pypi/simple  # install paddlex

!paddlex --split_dataset --format VOC --dataset_dir 'work/dataset/VOCdataset' --val_value 0.2 --test_value 0.1

Data augmentation

import paddlex as pdx
from paddlex import transforms as T

train_transforms = T.Compose([
    T.RandomHorizontalFlip(),
    T.RandomDistort(),
    T.RandomExpand(),
    T.RandomCrop(),
    T.ResizeByShort(short_size=250, max_size=-1),
    T.Normalize()
])

eval_transforms = T.Compose([
    T.ResizeByShort(short_size=250, max_size=-1),
    T.Normalize()
])

Load the datasets

train_dataset = pdx.datasets.VOCDetection(
    data_dir=r'work/dataset/VOCdataset',
    file_list=r'work/dataset/VOCdataset/train_list.txt',
    label_list=r'work/dataset/VOCdataset/labels.txt',
    transforms=train_transforms,
    # shuffle=True
)

eval_dataset = pdx.datasets.VOCDetection(
    data_dir=r'work/dataset/VOCdataset',
    file_list=r'work/dataset/VOCdataset/val_list.txt',
    label_list=r'work/dataset/VOCdataset/labels.txt',
    transforms=eval_transforms)

Model training: Faster RCNN

num_classes = len(train_dataset.labels)+1
print(num_classes)
model = pdx.det.FasterRCNN(num_classes=num_classes, backbone='ResNet50')
model.train(
    num_epochs=10,
    train_dataset=train_dataset,
    train_batch_size=1,
    eval_dataset=eval_dataset,
    save_interval_epochs=1,
    learning_rate=0.00025,
    # learning_rate=0.0001,
    # lr_decay_epochs = [3,6],
    # lr_decay_gamma = 0.1,
    metric='VOC',
    save_dir='output/faster_rcnn_r50_fpn',
    use_vdl=True)

At this point model training fails with IndexError: list index out of range. The full error output is in the bug description.

  1. Which dataset are you using? (10 images I annotated myself with labelImg, for testing) VOCdataset.zip

  2. Please provide the error message and the related logs

Code executed:

num_classes = len(train_dataset.labels)+1
print(num_classes)
model = pdx.det.FasterRCNN(num_classes=num_classes, backbone='ResNet50')
model.train(
    num_epochs=10,
    train_dataset=train_dataset,
    train_batch_size=1,
    eval_dataset=eval_dataset,
    save_interval_epochs=1,
    learning_rate=0.00025,
    metric='VOC',
    save_dir='output/faster_rcnn_r50_fpn',
    use_vdl=True)

Bug description:

2023-03-08 16:08:03 [INFO] Loading pretrained model from output/faster_rcnn_r50_fpn/pretrain/ResNet50_cos_pretrained.pdparams
2023-03-08 16:08:04 [WARNING] neck.fpn_inner_res2_sum_lateral.weight is not in pretrained model
2023-03-08 16:08:04 [WARNING] neck.fpn_inner_res2_sum_lateral.bias is not in pretrained model
2023-03-08 16:08:04 [WARNING] neck.fpn_res2_sum.weight is not in pretrained model
2023-03-08 16:08:04 [WARNING] neck.fpn_res2_sum.bias is not in pretrained model
2023-03-08 16:08:04 [WARNING] neck.fpn_inner_res3_sum_lateral.weight is not in pretrained model
2023-03-08 16:08:04 [WARNING] neck.fpn_inner_res3_sum_lateral.bias is not in pretrained model
2023-03-08 16:08:04 [WARNING] neck.fpn_res3_sum.weight is not in pretrained model
2023-03-08 16:08:04 [WARNING] neck.fpn_res3_sum.bias is not in pretrained model
2023-03-08 16:08:04 [WARNING] neck.fpn_inner_res4_sum_lateral.weight is not in pretrained model
2023-03-08 16:08:04 [WARNING] neck.fpn_inner_res4_sum_lateral.bias is not in pretrained model
2023-03-08 16:08:04 [WARNING] neck.fpn_res4_sum.weight is not in pretrained model
2023-03-08 16:08:04 [WARNING] neck.fpn_res4_sum.bias is not in pretrained model
2023-03-08 16:08:04 [WARNING] neck.fpn_inner_res5_sum.weight is not in pretrained model
2023-03-08 16:08:04 [WARNING] neck.fpn_inner_res5_sum.bias is not in pretrained model
2023-03-08 16:08:04 [WARNING] neck.fpn_res5_sum.weight is not in pretrained model
2023-03-08 16:08:04 [WARNING] neck.fpn_res5_sum.bias is not in pretrained model
2023-03-08 16:08:04 [WARNING] rpn_head.rpn_feat.rpn_conv.weight is not in pretrained model
2023-03-08 16:08:04 [WARNING] rpn_head.rpn_feat.rpn_conv.bias is not in pretrained model
2023-03-08 16:08:04 [WARNING] rpn_head.rpn_rois_score.weight is not in pretrained model
2023-03-08 16:08:04 [WARNING] rpn_head.rpn_rois_score.bias is not in pretrained model
2023-03-08 16:08:04 [WARNING] rpn_head.rpn_rois_delta.weight is not in pretrained model
2023-03-08 16:08:04 [WARNING] rpn_head.rpn_rois_delta.bias is not in pretrained model
2023-03-08 16:08:04 [WARNING] bbox_head.head.fc6.weight is not in pretrained model
2023-03-08 16:08:04 [WARNING] bbox_head.head.fc6.bias is not in pretrained model
2023-03-08 16:08:04 [WARNING] bbox_head.head.fc7.weight is not in pretrained model
2023-03-08 16:08:04 [WARNING] bbox_head.head.fc7.bias is not in pretrained model
2023-03-08 16:08:04 [WARNING] bbox_head.bbox_score.weight is not in pretrained model
2023-03-08 16:08:04 [WARNING] bbox_head.bbox_score.bias is not in pretrained model
2023-03-08 16:08:04 [WARNING] bbox_head.bbox_delta.weight is not in pretrained model
2023-03-08 16:08:04 [WARNING] bbox_head.bbox_delta.bias is not in pretrained model
2023-03-08 16:08:04 [INFO] There are 265/295 variables loaded into FasterRCNN.
2023-03-08 16:08:06 [INFO] [TRAIN] Epoch 1 finished, loss_rpn_cls=0.6888587, loss_rpn_reg=0.007873626, loss_bbox_cls=0.95418745, loss_bbox_reg=0.01720513, loss=1.6681249 .
2023-03-08 16:08:06 [INFO] Start to evaluate(total_samples=2, total_steps=2)...
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
/tmp/ipykernel_98/2582658238.py in <module>
     15 metric='VOC',
     16 save_dir='output/faster_rcnn_r50_fpn',
---> 17 use_vdl=True)

/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddlex/cv/models/detector.py in train(self, num_epochs, train_dataset, train_batch_size, eval_dataset, optimizer, save_interval_epochs, log_interval_steps, save_dir, pretrain_weights, learning_rate, warmup_steps, warmup_start_lr, lr_decay_epochs, lr_decay_gamma, metric, use_ema, early_stop, early_stop_patience, use_vdl, resume_checkpoint)
   1376 pretrain_weights, learning_rate, warmup_steps, warmup_start_lr,
   1377 lr_decay_epochs, lr_decay_gamma, metric, use_ema, early_stop,
-> 1378 early_stop_patience, use_vdl, resume_checkpoint)
   1379
   1380 def _compose_batch_transform(self, transforms, mode='train'):

/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddlex/cv/models/detector.py in train(self, num_epochs, train_dataset, train_batch_size, eval_dataset, optimizer, save_interval_epochs, log_interval_steps, save_dir, pretrain_weights, learning_rate, warmup_steps, warmup_start_lr, lr_decay_epochs, lr_decay_gamma, metric, use_ema, early_stop, early_stop_patience, use_vdl, resume_checkpoint)
    332 early_stop=early_stop,
    333 early_stop_patience=early_stop_patience,
--> 334 use_vdl=use_vdl)
    335
    336 def quant_aware_train(self,

/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddlex/cv/models/base.py in train_loop(self, num_epochs, train_dataset, train_batch_size, eval_dataset, save_interval_epochs, log_interval_steps, save_dir, ema, early_stop, early_stop_patience, use_vdl)
    395 eval_dataset,
    396 batch_size=eval_batch_size,
--> 397 return_details=True)
    398 # save the best model
    399 if local_rank == 0:

/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddlex/cv/models/detector.py in evaluate(self, eval_dataset, batch_size, metric, return_details)
    498 for step, data in enumerate(self.eval_data_loader):
    499 outputs = self.run(self.net, data, 'eval')
--> 500 eval_metric.update(data, outputs)
    501 eval_metric.accumulate()
    502 self.eval_details = eval_metric.details

/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddlex/cv/models/utils/det_metrics/metrics.py in update(self, inputs, outputs)
    118 difficult)
    119 self.detection_map.update(bbox, score, label, gt_box, gt_label,
--> 120 difficult)
    121 bbox_idx += bbox_num
    122

/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddlex/ppdet/metrics/map_utils.py in update(self, bbox, score, label, gt_box, gt_label, difficult)
    231 self.class_score_poss[int(l)].append([s, 0.0])
    232 else:
--> 233 self.class_score_poss[int(l)].append([s, 0.0])
    234
    235 def reset(self):

IndexError: list index out of range
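The last frame of the traceback indexes `self.class_score_poss` with the integer label `int(l)`. Below is a minimal sketch of how that pattern can raise this exact error. It assumes the list holds one entry per class known to the metric, which is inferred from the traceback rather than from reading the PaddleX source. `record_detection` and `num_known_classes` are hypothetical names used only for illustration.

```python
def record_detection(class_score_poss, label, score):
    """Append a (score, 0.0) pair to the per-class list, as the traceback line does."""
    class_score_poss[int(label)].append([score, 0.0])


# Assumption: one empty list per class the metric knows about.
num_known_classes = 2
class_score_poss = [[] for _ in range(num_known_classes)]

# A label index inside the known range is recorded normally.
record_detection(class_score_poss, 1, 0.9)

# A label index equal to (or larger than) the number of known classes fails.
try:
    record_detection(class_score_poss, 2, 0.8)
    error_message = None
except IndexError as exc:
    error_message = str(exc)

print(error_message)
```

If that assumption holds, the error means some label index seen during evaluation is at least as large as the number of classes the metric was built with.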

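Environment

  1. Please provide the PaddlePaddle and PaddleX versions you are using: PaddlePaddle 2.3.2; paddlex==2.1.0

  2. Please provide your operating system, e.g. Linux/Windows/MacOS: Windows 11

  3. Which Python version are you using? Python 3.7

  4. Which CUDA/cuDNN version are you using? V100 16GB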

sssdhjh, Mar 08 '23 08:03

Check whether the data and the labels match. Judging from the error, some object label cannot be found in labels.txt.
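One way to run that check is sketched below. It assumes the annotations are labelImg-style Pascal VOC XML files with `<object><name>...</name></object>` entries, and that labels.txt lists one label per line. `find_unknown_labels` is a hypothetical helper written for this thread, not a PaddleX function.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from pathlib import Path


def find_unknown_labels(annotation_dir, label_file):
    """Count object names used in the XML annotations but missing from label_file."""
    # Labels the dataset declares, one per non-empty line.
    known = {
        line.strip()
        for line in Path(label_file).read_text(encoding="utf-8").splitlines()
        if line.strip()
    }
    unknown = Counter()
    for xml_path in sorted(Path(annotation_dir).glob("*.xml")):
        root = ET.parse(xml_path).getroot()
        for obj in root.iter("object"):
            # Strip whitespace so " dog " and "dog" compare equal.
            name = (obj.findtext("name") or "").strip()
            if name not in known:
                unknown[name] += 1
    return unknown
```

For the dataset in this issue the call might look like `find_unknown_labels('work/dataset/VOCdataset/Annotations', 'work/dataset/VOCdataset/labels.txt')`, assuming the XML files sit in an `Annotations` subfolder. An empty result means every annotated name appears in labels.txt.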

lailuboy, Mar 09 '23 01:03