
RuntimeError: CUDA error: CUBLAS_STATUS_NOT_INITIALIZED when calling `cublasCreate(handle)`

Open mindcz opened this issue 2 years ago • 14 comments

Hi guys, first of all thank you for your work. I tried to set up the web UI and it starts fine, but any request to generate ends with the same error. Any ideas how to fix it? (screenshot)

mindcz avatar Mar 13 '23 06:03 mindcz

Even after a few successful generations it fails with the same error: (screenshot)

mindcz avatar Mar 13 '23 10:03 mindcz

Those are two different errors. I don't know what the first is, but the second means that your GPU ran out of memory.

oobabooga avatar Mar 13 '23 22:03 oobabooga

I'm receiving what looks like the same error:

bash start-webui.sh
Loading the extension "gallery"... Ok.
The following models are available:

  1. gpt-j-6B
  2. gpt4chan_model_float16
  3. opt-1.3b
  4. opt-2.7b
  5. pygmalion-6b

Which one do you want to load? 1-5

2

Loading gpt4chan_model_float16...
Auto-assigning --gpu-memory 11 for your GPU to try to prevent out-of-memory errors. You can manually set other values.
Loaded the model in 41.19 seconds.
Running on local URL: http://127.0.0.1:7860

To create a public link, set share=True in launch().
Exception in thread Thread-3 (gentask):
Traceback (most recent call last):
  File "/home/robert/one-click-installers-oobabooga-linux/installer_files/env/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "/home/robert/one-click-installers-oobabooga-linux/installer_files/env/lib/python3.10/threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "/home/robert/one-click-installers-oobabooga-linux/text-generation-webui/modules/callbacks.py", line 64, in gentask
    ret = self.mfunc(callback=_callback, **self.kwargs)
  File "/home/robert/one-click-installers-oobabooga-linux/text-generation-webui/modules/text_generation.py", line 196, in generate_with_callback
    shared.model.generate(**kwargs)
  File "/home/robert/one-click-installers-oobabooga-linux/installer_files/env/lib/python3.10/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/home/robert/one-click-installers-oobabooga-linux/installer_files/env/lib/python3.10/site-packages/transformers/generation/utils.py", line 1452, in generate
    return self.sample(
  File "/home/robert/one-click-installers-oobabooga-linux/installer_files/env/lib/python3.10/site-packages/transformers/generation/utils.py", line 2468, in sample
    outputs = self(
  File "/home/robert/one-click-installers-oobabooga-linux/installer_files/env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/robert/one-click-installers-oobabooga-linux/installer_files/env/lib/python3.10/site-packages/accelerate/hooks.py", line 165, in new_forward
    output = old_forward(*args, **kwargs)
  File "/home/robert/one-click-installers-oobabooga-linux/installer_files/env/lib/python3.10/site-packages/transformers/models/gptj/modeling_gptj.py", line 838, in forward
    transformer_outputs = self.transformer(
  File "/home/robert/one-click-installers-oobabooga-linux/installer_files/env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/robert/one-click-installers-oobabooga-linux/installer_files/env/lib/python3.10/site-packages/transformers/models/gptj/modeling_gptj.py", line 671, in forward
    outputs = block(
  File "/home/robert/one-click-installers-oobabooga-linux/installer_files/env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/robert/one-click-installers-oobabooga-linux/installer_files/env/lib/python3.10/site-packages/accelerate/hooks.py", line 165, in new_forward
    output = old_forward(*args, **kwargs)
  File "/home/robert/one-click-installers-oobabooga-linux/installer_files/env/lib/python3.10/site-packages/transformers/models/gptj/modeling_gptj.py", line 301, in forward
    attn_outputs = self.attn(
  File "/home/robert/one-click-installers-oobabooga-linux/installer_files/env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/robert/one-click-installers-oobabooga-linux/installer_files/env/lib/python3.10/site-packages/accelerate/hooks.py", line 165, in new_forward
    output = old_forward(*args, **kwargs)
  File "/home/robert/one-click-installers-oobabooga-linux/installer_files/env/lib/python3.10/site-packages/transformers/models/gptj/modeling_gptj.py", line 202, in forward
    query = self.q_proj(hidden_states)
  File "/home/robert/one-click-installers-oobabooga-linux/installer_files/env/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/robert/one-click-installers-oobabooga-linux/installer_files/env/lib/python3.10/site-packages/accelerate/hooks.py", line 165, in new_forward
    output = old_forward(*args, **kwargs)
  File "/home/robert/one-click-installers-oobabooga-linux/installer_files/env/lib/python3.10/site-packages/torch/nn/modules/linear.py", line 114, in forward
    return F.linear(input, self.weight, self.bias)
RuntimeError: CUDA error: CUBLAS_STATUS_NOT_INITIALIZED when calling cublasCreate(handle)

bitshifter52 avatar Mar 15 '23 17:03 bitshifter52
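Editor's note for anyone debugging tracebacks like the one above: CUDA reports failures asynchronously, so the Python line PyTorch blames is often not where the problem actually occurred, and `CUBLAS_STATUS_NOT_INITIALIZED` at `cublasCreate` frequently just means cuBLAS could not allocate its workspace because VRAM was already exhausted. A minimal sketch of the standard PyTorch debugging switch, `CUDA_LAUNCH_BLOCKING` (an illustrative snippet, not part of the web UI):

```python
import os

# CUDA kernel launches are asynchronous, so the frame reported in a CUDA
# traceback is often not the one that actually failed. Setting this variable
# *before* importing torch forces synchronous launches, which makes the
# reported location accurate (at a noticeable performance cost).
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

# import torch  # must happen only after the variable is set
print(os.environ["CUDA_LAUNCH_BLOCKING"])  # 1
```

With this set, rerunning the web UI should point at the real failing operation instead of an arbitrary later line.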

Fixed this by reducing the VRAM with the --gpu-memory flag by one gigabyte.

RazeLighter777 avatar Mar 19 '23 22:03 RazeLighter777

It is now possible to set more granular --gpu-memory values too, like --gpu-memory 3400MiB.

oobabooga avatar Mar 19 '23 22:03 oobabooga
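Editor's note: for readers unsure what the flag accepts after this change, a bare number is interpreted as GiB (as in the `--gpu-memory 11` log line above), while a `MiB` suffix gives finer control. A minimal sketch of that conversion, assuming only those two forms (`parse_gpu_memory` is a hypothetical helper for illustration, not the web UI's actual parser):

```python
def parse_gpu_memory(value: str) -> int:
    """Convert a --gpu-memory style value to bytes.

    A bare number is treated as GiB (e.g. "11") and a "MiB" suffix as
    mebibytes (e.g. "3400MiB"). Hypothetical helper for illustration,
    not the web UI's actual parser.
    """
    value = value.strip()
    if value.lower().endswith("mib"):
        return int(value[:-3]) * 1024 ** 2
    return int(value) * 1024 ** 3  # bare number: gibibytes

print(parse_gpu_memory("3400MiB"))                     # 3565158400
print(parse_gpu_memory("11") - parse_gpu_memory("1"))  # reduced by 1 GiB: 10737418240
```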

None of the suggestions are working for me; I still get the same error as OP.

RaymondTracer avatar Mar 21 '23 04:03 RaymondTracer

Same here, using LLaMA-7B:

(screenshot)

and whenever I try to generate anything:

(screenshot)

the entire GPU crashes and I get this error:

(screenshot)

I have --gpu-memory set to 8 with 12 GB of VRAM.

LoganDark avatar Mar 27 '23 04:03 LoganDark
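Editor's note: the pattern in the reports above is that the cap has to leave real headroom, because the CUDA context, cuBLAS workspaces, and the attention cache that grows during generation all live outside the weights. A rough heuristic sketch (the 3 GiB default reserve is an assumption for illustration, not a project recommendation):

```python
def suggested_gpu_memory(total_vram_gib: float, reserve_gib: float = 3.0) -> int:
    """Suggest a --gpu-memory cap in MiB, leaving `reserve_gib` of headroom
    for the CUDA context, cuBLAS workspaces, and the growing attention cache.
    Illustrative heuristic only; the right reserve varies by model and
    context length."""
    usable_gib = max(total_vram_gib - reserve_gib, 0.0)
    return int(usable_gib * 1024)  # MiB, usable with the MiB suffix

# A 12 GiB card, as in the comment above:
print(f"--gpu-memory {suggested_gpu_memory(12)}MiB")  # --gpu-memory 9216MiB
```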

mine was caused by a too-high GPU overclock, nice

LoganDark avatar Mar 27 '23 05:03 LoganDark

I reinstalled the new version and now it won't start at all:

Starting the web UI...
Warning: --cai-chat is deprecated. Use --chat instead.
Traceback (most recent call last):
  File "C:\Distr\oobabooga-windows\text-generation-webui\server.py", line 18, in <module>
    from modules import chat, shared, training, ui, api
  File "C:\Distr\oobabooga-windows\text-generation-webui\modules\chat.py", line 15, in <module>
    from modules.html_generator import (fix_newlines, chat_html_wrapper,
  File "C:\Distr\oobabooga-windows\text-generation-webui\modules\html_generator.py", line 12, in <module>
    import markdown
ModuleNotFoundError: No module named 'markdown'
Press any key to continue . . .

mindcz avatar Apr 06 '23 10:04 mindcz

Try replacing your install.bat with the updated one and re-running it https://github.com/oobabooga/one-click-installers/

oobabooga avatar Apr 06 '23 17:04 oobabooga

After the update I get

CUDA SETUP: Required library version not found: libsbitsandbytes_cpu.so. Maybe you need to compile it from source?
CUDA SETUP: Defaulting to libbitsandbytes_cpu.so...
argument of type 'WindowsPath' is not iterable
CUDA SETUP: Required library version not found: libsbitsandbytes_cpu.so. Maybe you need to compile it from source?
CUDA SETUP: Defaulting to libbitsandbytes_cpu.so...
argument of type 'WindowsPath' is not iterable
C:\Users\user\Downloads\one-click-installers-oobabooga-windows\installer_files\env\lib\site-packages\bitsandbytes\cextension.py:31: UserWarning: The installed version of bitsandbytes was compiled without GPU support. 8-bit optimizers and GPU quantization are unavailable.
  warn("The installed version of bitsandbytes was compiled without GPU support. "

and

Warning: torch.cuda.is_available() returned False.
This means that no GPU has been detected.
Falling back to CPU mode.

RaymondTracer avatar Apr 07 '23 00:04 RaymondTracer
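Editor's note: the last warning is the key one. If `torch.cuda.is_available()` returns False, the CUBLAS error is moot, because torch itself is running CPU-only, which is also what the bitsandbytes warning above suggests. A small diagnostic sketch to tell the cases apart (`cuda_status` is a hypothetical helper, not part of the web UI):

```python
def cuda_status() -> str:
    """Roughly diagnose why generation might fall back to CPU.
    Hypothetical helper: distinguishes a missing torch install, a
    CPU-only torch build (or a GPU that isn't visible), and a working
    CUDA setup."""
    try:
        import torch
    except ImportError:
        return "torch-missing"
    if not torch.cuda.is_available():
        # Common causes: a CPU-only torch wheel, a driver too old for the
        # bundled CUDA runtime, or no NVIDIA GPU visible at all.
        return "cpu-only"
    return f"cuda:{torch.version.cuda} on {torch.cuda.get_device_name(0)}"

print(cuda_status())
```

If this prints `cpu-only`, reinstalling a CUDA-enabled torch wheel inside the installer's env is the usual fix.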

Try replacing your install.bat with the updated one and re-running it https://github.com/oobabooga/one-click-installers/

For now there's a new problem. I have conda installed and nothing was changed on my side, but now I can't run the install: (screenshot)

mindcz avatar Apr 10 '23 06:04 mindcz

I have the same error (the first one), CUBLAS_STATUS_NOT_INITIALIZED, when trying to launch a LLaMA model.

RandomLegend avatar Apr 18 '23 18:04 RandomLegend

@ALL I was having this error when launching multiple LLM instances on one GPU. If I run textgen alone, it's fine.

yhyu13 avatar Apr 29 '23 16:04 yhyu13

Does the project require a certain version of NVIDIA CUDA? I'm running 12.1. I'm getting the error with a few 13B models, but not with TheBloke_vicuna-7B-1.1-GPTQ-4bit-128g.

Edit: solved by simply not using the other similar-looking models on Hugging Face. Using OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5 with some of the options that reduce and control the amount of VRAM used works for me.

jtara1 avatar Jun 14 '23 00:06 jtara1

This issue has been closed due to inactivity for 6 weeks. If you believe it is still relevant, please leave a comment below. You can tag a developer in your comment.

github-actions[bot] avatar Dec 06 '23 23:12 github-actions[bot]

Fixed this by reducing the VRAM with the --gpu-memory flag by one gigabyte.

@RazeLighter777 ,

I'm currently having this same issue and came across this thread while searching for a solution. Exactly how did you reduce the memory with the --gpu-memory flag? Thanks.

lorsonblair avatar Apr 24 '24 15:04 lorsonblair