awesome-semantic-segmentation-pytorch

voc2012 custom dataset train.py RuntimeError: Given groups

Open dangbinghoo opened this issue 4 years ago • 2 comments

(torchpy3) chiot@chiot-AI:~/Desktop/SemSeg/awesome-semantic-segmentation-pytorch/scripts$ python train.py --model psp --backbone resnet50 --dataset pascal_voc --lr 0.0001 --epochs 50
2020-03-02 17:38:57,152 semantic_segmentation INFO: Using 1 GPUs
2020-03-02 17:38:57,152 semantic_segmentation INFO: Namespace(aux=False, aux_weight=0.4, backbone='resnet50', base_size=520, batch_size=4, crop_size=480, dataset='pascal_voc', device='cuda', distributed=False, epochs=50, jpu=False, local_rank=0, log_dir='../runs/logs/', log_iter=10, lr=0.0001, model='psp', momentum=0.9, no_cuda=False, num_gpus=1, resume=None, save_dir='~/.torch/models', save_epoch=10, skip_val=False, start_epoch=0, use_ohem=False, val_epoch=1, warmup_factor=0.3333333333333333, warmup_iters=0, warmup_method='linear', weight_decay=0.0001, workers=4)
Found 126 images in the folder ../datasets/voc/VOC2012
Found 12 images in the folder ../datasets/voc/VOC2012
2020-03-02 17:38:59,963 semantic_segmentation INFO: Start training, Total Epochs: 50 = Total Iterations 1550
/home/chiot/Desktop/SemSeg/torchpy3/lib/python3.6/site-packages/torch/optim/lr_scheduler.py:122: UserWarning: Detected call of `lr_scheduler.step()` before `optimizer.step()`. In PyTorch 1.1.0 and later, you should call them in the opposite order: `optimizer.step()` before `lr_scheduler.step()`.  Failure to do this will result in PyTorch skipping the first value of the learning rate schedule. See more details at https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate
  "https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate", UserWarning)
Traceback (most recent call last):
  File "train.py", line 327, in <module>
    trainer.train()
  File "train.py", line 221, in train
    outputs = self.model(images)
  File "/home/chiot/Desktop/SemSeg/torchpy3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/chiot/Desktop/SemSeg/awesome-semantic-segmentation-pytorch/core/models/pspnet.py", line 44, in forward
    _, _, c3, c4 = self.base_forward(x)
  File "/home/chiot/Desktop/SemSeg/awesome-semantic-segmentation-pytorch/core/models/segbase.py", line 38, in base_forward
    x = self.pretrained.conv1(x)
  File "/home/chiot/Desktop/SemSeg/torchpy3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/chiot/Desktop/SemSeg/torchpy3/lib/python3.6/site-packages/torch/nn/modules/container.py", line 100, in forward
    input = module(input)
  File "/home/chiot/Desktop/SemSeg/torchpy3/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/chiot/Desktop/SemSeg/torchpy3/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 345, in forward
    return self.conv2d_forward(input, self.weight)
  File "/home/chiot/Desktop/SemSeg/torchpy3/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 342, in conv2d_forward
    self.padding, self.dilation, self.groups)
RuntimeError: Given groups=1, weight of size 64 3 3 3, expected input[4, 480, 480, 3] to have 3 channels, but got 480 channels instead
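
A minimal sketch of why the message reports 480 channels: `nn.Conv2d` treats dimension 1 of its input as the channel axis, so a channels-last batch (N, H, W, C) is read as having H channels. The layer below is a stand-in with the same weight shape as in the log (64, 3, 3, 3), not the repository's actual backbone, and the spatial size is reduced to 32 to keep the example light, so the message mentions 32 channels rather than 480.

```python
import torch
import torch.nn as nn

# Stand-in for the first backbone conv: weight shape (64, 3, 3, 3), as in the log.
conv = nn.Conv2d(3, 64, kernel_size=3, padding=1, bias=False)

# Channels-last batch (N, H, W, C); 32x32 instead of 480x480 to stay small.
nhwc = torch.zeros(4, 32, 32, 3)

try:
    with torch.no_grad():
        conv(nhwc)  # dimension 1 (size 32) is interpreted as the channel axis
    error_message = None
except RuntimeError as exc:
    error_message = str(exc)

print(error_message)

# The same data in channels-first layout (N, C, H, W) is accepted.
with torch.no_grad():
    out = conv(nhwc.permute(0, 3, 1, 2))
print(out.shape)
```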

I googled around and found that the image channel layout is incorrect, but I don't know how to change the code to convert it correctly:

In pascal_voc.py:
    def __getitem__(self, index):
        img = Image.open(self.images[index]).convert('RGB')
        if self.mode == 'test':
            img = self._img_transform(img)
            if self.transform is not None:
                img = self.transform(img)
            return img, os.path.basename(self.images[index])
        mask = Image.open(self.masks[index])
        # synchronized transform
        if self.mode == 'train':
            img, mask = self._sync_transform(img, mask)

dangbinghoo, Mar 02 '20 09:03

Please change shape [4, 480, 480, 3] to [4, 3, 480, 480]. Just like: image = image.transpose(0, 3, 1, 2).
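
For a PyTorch tensor, the multi-axis form of this reordering is `permute`; `transpose(0, 3, 1, 2)` is the NumPy-style call. A minimal sketch, assuming `images` is a batch in (N, H, W, C) layout as reported in the error (the tensor here is a placeholder, not data from the repository):

```python
import torch

# Placeholder batch in channels-last layout: (N, H, W, C).
images = torch.zeros(4, 480, 480, 3)

# Reorder to channels-first (N, C, H, W); contiguous() makes a compact copy.
images = images.permute(0, 3, 1, 2).contiguous()

print(images.shape)
```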

Tramac, Mar 02 '20 09:03

> Please change shape [4, 480, 480, 3] to [4, 3, 480, 480]. Just like: image = image.transpose(0, 3, 1, 2).

It seems torch.transpose only supports two arguments, so I just added:

images = images.transpose(1, 3)

after

            images = images.to(self.device)
            targets = targets.to(self.device)

The validation method in train.py has the same problem, so I added:

image = image.transpose(1, 3)

after target = target.to(self.device). Training now seems to run, but I don't know whether this is the correct approach.
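
One way to check this is to compare `transpose(1, 3)` with `permute(0, 3, 1, 2)` on a small non-square batch. The sketch below uses a made-up tensor, not the repository's data. `transpose(1, 3)` yields (N, C, W, H), so height and width are swapped. With square 480x480 crops the shapes coincide and no error is raised, but each image would be transposed relative to its mask, assuming the targets stay in (N, H, W) order.

```python
import torch

# Made-up non-square batch so the axis order is visible: N=2, H=4, W=6, C=3.
batch = torch.arange(2 * 4 * 6 * 3).reshape(2, 4, 6, 3)

swapped = batch.transpose(1, 3)        # (N, C, W, H): height and width swapped
permuted = batch.permute(0, 3, 1, 2)   # (N, C, H, W): layout Conv2d expects

print(swapped.shape)
print(permuted.shape)

# Swapping the last two axes of `swapped` recovers the permuted layout,
# which confirms the two results differ only by an H/W transpose.
same_layout = torch.equal(swapped.transpose(2, 3), permuted)
print(same_layout)
```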

Thanks!

dangbinghoo, Mar 03 '20 08:03