ssd.pytorch

RuntimeError: cuda runtime error (11) : invalid argument at /pytorch/aten/src/THC/THCGeneral.cpp:383

Open · ouening opened this issue 4 years ago · 2 comments

OS: Ubuntu; Python: 3.6; CUDA: 9.0; PyTorch: 1.1.0 (GPU); GPU: RTX 2080 (8 GB)

When I execute the command: `python3 train.py --dataset VOC --dataset_root /media/gaoya/disk/Datasets/VOCdevkit/ --basenet vgg16_reducedfc.pth --batch_size 8`

The following error occurred:

Loading base network...
Initializing weights...
train.py:217: UserWarning: nn.init.xavier_uniform is now deprecated in favor of nn.init.xavier_uniform_.
  init.xavier_uniform(param)
Loading the dataset...
Training SSD on: VOC0712
Using the specified args:
Namespace(basenet='vgg16_reducedfc.pth', batch_size=8, cuda=True, dataset='VOC', dataset_root='/media/gaoya/disk/Datasets/VOCdevkit/', gamma=0.1, lr=0.001, momentum=0.9, num_workers=4, resume=None, save_folder='weights/', start_iter=0, visdom=False, weight_decay=0.0005)
train.py:172: UserWarning: volatile was removed and now has no effect. Use `with torch.no_grad():` instead.
  targets = [Variable(ann.cuda(), volatile=True) for ann in targets]
THCudaCheck FAIL file=/pytorch/aten/src/THC/THCGeneral.cpp line=383 error=11 : invalid argument
Traceback (most recent call last):
  File "train.py", line 258, in <module>
    train()
  File "train.py", line 178, in train
    out = net(images)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/parallel/data_parallel.py", line 152, in forward
    outputs = self.parallel_apply(replicas, inputs, kwargs)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/parallel/data_parallel.py", line 162, in parallel_apply
    return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/parallel/parallel_apply.py", line 83, in parallel_apply
    raise output
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/parallel/parallel_apply.py", line 59, in _worker
    output = module(*input, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/gaoya/Files/python/pytorch/ssd.pytorch-master/ssd.py", line 75, in forward
    x = self.vgg[k](x)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py", line 338, in forward
    self.padding, self.dilation, self.groups)
**_RuntimeError: cuda runtime error (11) : invalid argument at /pytorch/aten/src/THC/THCGeneral.cpp:383_**

Can someone help me fix this?

ouening, Nov 03 '19 04:11

I ran into the same problem! Did you manage to fix it?

vanellope666, Apr 20 '21 10:04

Oh, I finally fixed it by updating torchvision from 0.3.0 to 0.4.0!

vanellope666, Apr 20 '21 11:04
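
For anyone comparing their setup against the fix reported above, here is a minimal sketch for checking which torch and torchvision versions are installed before and after upgrading. It is not from the thread: the helper names `installed_version` and `version_tuple` are hypothetical, it only uses the standard library, and it assumes Python 3.8 or newer (for `importlib.metadata`), which is newer than the Python 3.6 in the original report. Upgrading torchvision with pip may also change the installed torch build, so it is worth printing both.

```python
import re
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def version_tuple(version):
    """Convert '0.4.0' or '1.1.0+cu90' into a comparable tuple such as (0, 4, 0)."""
    parts = []
    # Drop any local build suffix ("+cu90") and read the leading digits of each field.
    for piece in version.split("+")[0].split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


if __name__ == "__main__":
    for package in ("torch", "torchvision"):
        print(package, installed_version(package) or "not installed")

    tv = installed_version("torchvision")
    # 0.4.0 is the version the commenter above reported as working for them.
    if tv is not None and version_tuple(tv) < (0, 4, 0):
        print("torchvision is older than 0.4.0, the version reported to work above")
```

Running it once before and once after the upgrade gives a record of what actually changed, which helps if the error persists.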