GPT2-Chinese

RuntimeError: CUDA error: CUBLAS_STATUS_ALLOC_FAILED when calling `cublasCreate(handle)`

Open shjliqian opened this issue 4 years ago • 8 comments

Hello, could you tell me what causes the error below?

  File "train.py", line 202, in main
    outputs = model.forward(input_ids=batch_inputs, labels=batch_inputs)
  File "/home/nlp/anaconda3/envs/py36/lib/python3.6/site-packages/transformers/modeling_gpt2.py", line 533, in forward
    head_mask=head_mask)
  File "/home/nlp/anaconda3/envs/py36/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/nlp/anaconda3/envs/py36/lib/python3.6/site-packages/transformers/modeling_gpt2.py", line 441, in forward
    head_mask=head_mask[i])
  File "/home/nlp/anaconda3/envs/py36/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/nlp/anaconda3/envs/py36/lib/python3.6/site-packages/transformers/modeling_gpt2.py", line 231, in forward
    head_mask=head_mask)
  File "/home/nlp/anaconda3/envs/py36/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/nlp/anaconda3/envs/py36/lib/python3.6/site-packages/transformers/modeling_gpt2.py", line 181, in forward
    x = self.c_attn(x)
  File "/home/nlp/anaconda3/envs/py36/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/nlp/anaconda3/envs/py36/lib/python3.6/site-packages/transformers/modeling_utils.py", line 440, in forward
    x = torch.addmm(self.bias, x.view(-1, x.size(-1)), self.weight)
RuntimeError: CUDA error: CUBLAS_STATUS_ALLOC_FAILED when calling `cublasCreate(handle)`

shjliqian, Apr 10 '20 12:04

Did you solve it? I'm getting this error too.

81549361, Apr 14 '20 14:04

I ran into this too. It was a parameter issue.

movecpp, Aug 28 '20 12:08

My RuntimeError: CUDA error seems to be a bit different. Reducing --batch_size fixed it.

xuxiaoyaoo, Oct 20 '20 05:10
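The batch-size workaround above can be sketched as a simple back-off loop. This is an illustration, not code from GPT2-Chinese: `find_working_batch_size` and the `run_step` callback are hypothetical names, and the sketch assumes the failure surfaces as a Python `RuntimeError`, as in the traceback in this issue.

```python
from typing import Callable


def find_working_batch_size(run_step: Callable[[int], None],
                            start: int, minimum: int = 1) -> int:
    """Halve the batch size until run_step(batch_size) stops raising RuntimeError.

    run_step is a hypothetical callback that runs one training step
    (forward and backward pass) with the given batch size.
    """
    batch_size = start
    while batch_size >= minimum:
        try:
            run_step(batch_size)
            return batch_size
        except RuntimeError as err:
            print(f"batch_size={batch_size} failed: {err}")
            batch_size //= 2
    raise RuntimeError(f"no batch size >= {minimum} worked")


if __name__ == "__main__":
    # Stand-in for a real training step: pretend anything above 4 fails.
    def fake_step(batch_size: int) -> None:
        if batch_size > 4:
            raise RuntimeError("CUDA error: CUBLAS_STATUS_ALLOC_FAILED")

    print(find_working_batch_size(fake_step, start=32))
```

A CUDA error can leave the running process in an unusable state, so in practice restarting training with a smaller --batch_size, as described above, is likely the safer route. The loop is mainly useful for probing a limit quickly.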

> I ran into this too. It was a parameter issue.

Did you manage to solve it in the end?

tobran, Dec 17 '20 14:12

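>> I ran into this too. It was a parameter issue.

> Did you manage to solve it in the end?

I solved it at the time. Adjusting the model parameters is enough.

movecpp, Dec 18 '20 15:12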

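>>> I ran into this too. It was a parameter issue.

>> Did you manage to solve it in the end?

> I solved it at the time. Adjusting the model parameters is enough.

Could you tell me specifically how you changed the parameters?

baoyu-yuan, Jan 12 '21 09:01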

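>>>> I ran into this too. It was a parameter issue.

>>> Did you manage to solve it in the end?

>> I solved it at the time. Adjusting the model parameters is enough.

> Could you tell me specifically how you changed the parameters?

It is related to ctx and the number of layers. The exact behavior is hard to pin down, but in short, change those two.

movecpp, Jan 12 '21 16:01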
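As a rough illustration of changing those two parameters, the sketch below edits a GPT-2 style config held as a dictionary. It assumes the model config is a JSON object with `n_ctx` and `n_layer` keys (the names used by GPT-2 configs in older transformers releases). `shrink_config` is a hypothetical helper, and the values are illustrative, not taken from this thread.

```python
import json


def shrink_config(config: dict, n_ctx: int, n_layer: int) -> dict:
    """Return a copy of a GPT-2 style config with a smaller context and depth."""
    if n_ctx <= 0 or n_layer <= 0:
        raise ValueError("n_ctx and n_layer must be positive")
    smaller = dict(config)
    smaller["n_ctx"] = n_ctx
    smaller["n_layer"] = n_layer
    # Assumption: the position-embedding table should cover the context window,
    # so n_positions is kept in step with n_ctx when the key is present.
    if "n_positions" in smaller:
        smaller["n_positions"] = n_ctx
    return smaller


if __name__ == "__main__":
    # Illustrative values only; not taken from this issue.
    original = {"n_ctx": 1024, "n_positions": 1024, "n_embd": 768,
                "n_head": 12, "n_layer": 12}
    print(json.dumps(shrink_config(original, n_ctx=512, n_layer=6), indent=2))
```

Both a shorter context and fewer layers reduce GPU memory use, which may be why adjusting them made the error go away. A model trained from scratch with the smaller config is a different model, so existing checkpoints would not load into it unchanged.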

Installing torch 1.8.1 with CUDA 11.1 works for me:

pip3 install torch==1.8.1+cu111 torchvision==0.9.1+cu111 torchaudio==0.8.1 -f https://download.pytorch.org/whl/lts/1.8/torch_lts.html

(Copied from PyTorch website)

hughplay, Jul 05 '21 13:07
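After reinstalling, it may help to confirm which build is actually active. The following is a minimal sketch: `split_build_tag` is a hypothetical helper for reading the `+cu111` suffix, and the live check only runs if torch is importable in the current environment.

```python
from typing import Optional, Tuple


def split_build_tag(version: str) -> Tuple[str, Optional[str]]:
    """Split a version such as '1.8.1+cu111' into ('1.8.1', 'cu111')."""
    release, _, local = version.partition("+")
    return release, (local or None)


if __name__ == "__main__":
    try:
        import torch  # only needed for the live check
    except ImportError:
        print("torch is not installed in this environment")
    else:
        release, tag = split_build_tag(torch.__version__)
        print("torch release:", release, "build tag:", tag)
        print("CUDA version torch was built with:", torch.version.cuda)
        print("CUDA available:", torch.cuda.is_available())
```

If the reported CUDA version is not the one expected from the install command, the environment is probably still picking up an older torch package.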