
error when run demo.py

Open TTTJJJWWW opened this issue 5 years ago • 9 comments

Traceback (most recent call last):
  File "demo.py", line 271, in <module>
    test()
  File "demo.py", line 178, in test
    encoder = encoder.cuda()
  File "/home/iie/.conda/envs/s2v/lib/python2.7/site-packages/torch/nn/modules/module.py", line 258, in cuda
    return self._apply(lambda t: t.cuda(device))
  File "/home/iie/.conda/envs/s2v/lib/python2.7/site-packages/torch/nn/modules/module.py", line 185, in _apply
    module._apply(fn)
  File "/home/iie/.conda/envs/s2v/lib/python2.7/site-packages/torch/nn/modules/rnn.py", line 112, in _apply
    self.flatten_parameters()
  File "/home/iie/.conda/envs/s2v/lib/python2.7/site-packages/torch/nn/modules/rnn.py", line 105, in flatten_parameters
    self.batch_first, bool(self.bidirectional))
RuntimeError: CuDNN error: CUDNN_STATUS_SUCCESS

TTTJJJWWW avatar Jun 19 '19 06:06 TTTJJJWWW

It's likely a CUDA problem. What's your environment?

kduy avatar Jun 19 '19 08:06 kduy

Ubuntu16.04 cudnn=7.1.3 cuda=9.0 pytorch=0.4.1 torchvision=0.2.1 python=2.7

TTTJJJWWW avatar Jun 20 '19 06:06 TTTJJJWWW

It's probably a CUDA problem. Please check whether your cuDNN, CUDA, and PyTorch versions are compatible with each other.
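Not from the thread, but one way to do that check: PyTorch records the CUDA and cuDNN versions it was built against, so they can be printed and compared with the system install (e.g. the cuda=9.0 / cudnn=7.1.3 reported above). A minimal sketch:

```python
# Sketch: print the CUDA/cuDNN versions this PyTorch build was compiled
# against, to compare with the versions installed on the system.
import torch

print("torch:", torch.__version__)
print("built for CUDA:", torch.version.cuda)          # None on CPU-only builds
print("cuDNN:", torch.backends.cudnn.version())       # None if cuDNN unavailable
print("GPU visible:", torch.cuda.is_available())
```

If "built for CUDA" disagrees with the system's `nvcc --version` / driver, that mismatch is a common source of CUDNN_STATUS errors.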

lelechen63 avatar Jun 21 '19 18:06 lelechen63

Maybe, but I have no idea how to fix this. I think the versions are okay; how would I change them? "Ubuntu16.04 cudnn=7.1.3 cuda=9.0 pytorch=0.4.1 torchvision=0.2.1 python=2.7"

TTTJJJWWW avatar Jun 24 '19 01:06 TTTJJJWWW

Try adding this line in models.py:

torch.backends.cudnn.enabled=False
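For context (my framing, not IQ17's): this flag is normally set once near the top of the entry script, before any module is moved to the GPU, so the RNN never tries to flatten its weights through cuDNN. A minimal sketch:

```python
# Sketch: disable cuDNN globally so PyTorch falls back to its native CUDA
# kernels. This sidesteps the flatten_parameters() cuDNN call that raises
# the CUDNN_STATUS error above. Must run before encoder.cuda().
import torch

torch.backends.cudnn.enabled = False
assert not torch.backends.cudnn.enabled  # cuDNN code paths are now skipped
```

The trade-off is speed: RNN and convolution layers lose their cuDNN-optimized kernels, but they still produce the same results.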

IQ17 avatar Mar 24 '20 13:03 IQ17

We are facing the same problem running CUDA 11 with Python 3. Would you please help?

Mora-max avatar Mar 13 '21 18:03 Mora-max

> Try to add this line in models.py: torch.backends.cudnn.enabled=False

This worked, but now I'm running into this issue:

=======================================
Start to generate images
Traceback (most recent call last):
  File "demo.py", line 272, in <module>
    test()
  File "demo.py", line 235, in test
    fake_lmark = encoder(example_landmark, input_mfcc)
  File "/home/arta/anaconda3/envs/py2/lib/python2.7/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/arta/ATVGnet/code/models.py", line 54, in forward
    example_landmark_f = self.lmark_encoder(example_landmark)
  File "/home/arta/anaconda3/envs/py2/lib/python2.7/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/arta/anaconda3/envs/py2/lib/python2.7/site-packages/torch/nn/modules/container.py", line 91, in forward
    input = module(input)
  File "/home/arta/anaconda3/envs/py2/lib/python2.7/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/arta/anaconda3/envs/py2/lib/python2.7/site-packages/torch/nn/modules/linear.py", line 55, in forward
    return F.linear(input, self.weight, self.bias)
  File "/home/arta/anaconda3/envs/py2/lib/python2.7/site-packages/torch/nn/functional.py", line 1024, in linear
    return torch.addmm(bias, input, weight.t())
RuntimeError: cublas runtime error : the GPU program failed to execute at /opt/conda/conda-bld/pytorch_1535488076166/work/aten/src/THC/THCBlas.cu:249

Many of the others online who report this problem are using an Nvidia 2080 Ti with CUDA older than 10, whereas I'm using a 3080 with CUDA 11.2.

I am running Ubuntu 20.04, cuDNN 8, CUDA 11.2, Python 2.7.
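Not stated in the thread, but a likely explanation for the cublas failure: an RTX 3080 has compute capability sm_86, and PyTorch binaries built against CUDA older than 11.1 ship no kernels for that architecture, so GPU programs fail to execute. A hedged sketch to check (the helper name `gpu_supported` is made up for illustration):

```python
# Sketch: compare the GPU's compute capability (e.g. sm_86 for a 3080)
# against the architectures this PyTorch binary was compiled for.
import torch

def gpu_supported():
    """Return True/False if determinable, None if no GPU or no arch list."""
    if not torch.cuda.is_available():
        return None  # no CUDA device visible
    major, minor = torch.cuda.get_device_capability(0)
    arch = "sm_%d%d" % (major, minor)
    # get_arch_list() exists in recent PyTorch; older builds lack it
    supported = getattr(torch.cuda, "get_arch_list", lambda: [])()
    return (arch in supported) if supported else None

print(gpu_supported())
```

If this returns False for an sm_86 card, the usual remedy is a PyTorch build targeting CUDA 11.1 or newer.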

aseyedia avatar Sep 30 '21 02:09 aseyedia

> Try to add this line in models.py: torch.backends.cudnn.enabled=False
>
> This worked, but now I'm running into this issue: (same cublas traceback as above)
>
> Many of the others online who report this problem are using an Nvidia 2080 Ti with CUDA older than 10, whereas I'm using a 2080 with CUDA 11.2. I am running Ubuntu 20.04, cuDNN 8, CUDA 11.2, Python 2.7.

Did you ever solve this? I'm getting the same error on Windows 10, CUDA 11.2, Python 3.7.

DanBigioi avatar Apr 04 '22 17:04 DanBigioi

@DanBigioi I don't remember. I did have a typo, though; I am using a 3080, not a 2080. I think I concluded I simply didn't have enough GPU memory.

aseyedia avatar Apr 04 '22 19:04 aseyedia