ChatGLM-6B [BUG/Help] <title> RuntimeError: Library cudart is not initialized

Is there an existing issue for this?

[X] I have searched the existing issues

Current Behavior

Explicitly passing a revision is encouraged when loading a model with custom code to ensure no malicious code has been contributed in a newer revision.
Explicitly passing a revision is encouraged when loading a configuration with custom code to ensure no malicious code has been contributed in a newer revision.
Explicitly passing a revision is encouraged when loading a model with custom code to ensure no malicious code has been contributed in a newer revision.
Loading checkpoint shards: 100%|██████████| 8/8 [00:09<00:00, 1.14s/it]

Traceback (most recent call last):

/data/text2music/ChatGLM-6B/cli_demo1.py:5 in <module>

     2 from transformers import AutoTokenizer, AutoModel
     3
     4 tokenizer = AutoTokenizer.from_pretrained("/data/text2music/ChatGLM-6B/local", trust_rem
 ❱   5 model = AutoModel.from_pretrained("/data/text2music/ChatGLM-6B/local", trust_remote_code
     6 model = model.eval()
     7
     8 history = []

/home/user_00/.cache/huggingface/modules/transformers_modules/local/modeling_chatglm.py:1154 in quantize

  1151
  1152     def quantize(self, bits: int):
  1153         from .quantization import quantize
 ❱ 1154         self.transformer = quantize(self.transformer, bits)
  1155         return self
  1156

/home/user_00/.cache/huggingface/modules/transformers_modules/local/quantization.py:147 in quantize

   144     """Replace fp16 linear with quantized linear"""
   145
   146     for layer in model.layers:
 ❱ 147         layer.attention.query_key_value = QuantizedLinear(
   148             weight_bit_width=weight_bit_width,
   149             weight_tensor=layer.attention.query_key_value.weight.to(torch.cuda.current_d
   150             bias_tensor=layer.attention.query_key_value.bias,

/home/user_00/.cache/huggingface/modules/transformers_modules/local/quantization.py:130 in __init__

   127             self.weight_scale = (weight_tensor.abs().max(dim=-1).values / ((2 ** (weight
   128             self.weight = torch.round(weight_tensor / self.weight_scale[:, None]).to(tor
   129             if weight_bit_width == 4:
 ❱ 130                 self.weight = compress_int4_weight(self.weight)
   131
   132         self.weight = Parameter(self.weight.to(kwargs["device"]), requires_grad=False)
   133         self.weight_scale = Parameter(self.weight_scale.to(kwargs["device"]), requires_g

/home/user_00/.cache/huggingface/modules/transformers_modules/local/quantization.py:71 in compress_int4_weight

    68         gridDim = (n, 1, 1)
    69         blockDim = (min(round_up(m, 32), 1024), 1, 1)
    70
 ❱  71         kernels.int4WeightCompression(
    72             gridDim,
    73             blockDim,
    74             0,

/data/miniconda3/envs/GLM/lib/python3.8/site-packages/cpm_kernels/kernels/base.py:48 in __call__

    45             sharedMemBytes : int, stream : cudart.cudaStream_t, params : List[Any] ) ->
    46         assert len(gridDim) == 3
    47         assert len(blockDim) == 3
 ❱  48         func = self._prepare_func()
    49
    50         cuda.cuLaunchKernel(func,
    51             gridDim[0], gridDim[1], gridDim[2],

/data/miniconda3/envs/GLM/lib/python3.8/site-packages/cpm_kernels/kernels/base.py:36 in _prepare_func

    33         self._func_name = func_name
    34
    35     def _prepare_func(self):
 ❱  36         curr_device = cudart.cudaGetDevice()
    37         cudart.cudaSetDevice(curr_device)  # ensure cudart context
    38         if curr_device not in self._funcs:
    39             self._funcs[curr_device] = cuda.cuModuleGetFunction(

/data/miniconda3/envs/GLM/lib/python3.8/site-packages/cpm_kernels/library/base.py:72 in wrapper

    69             def decorator(f):
    70                 @wraps(f)
    71                 def wrapper(*args, **kwargs):
 ❱  72                     raise RuntimeError("Library %s is not initialized" % self.__name)
    73                 return wrapper
    74             return decorator
    75         else:

RuntimeError: Library cudart is not initialized

Expected Behavior

I only call the quantize function to convert the model to INT4, but this exception is raised. How can I fix it so that ChatGLM-6B quantizes successfully?

Steps To Reproduce

import os
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("/data/text2music/ChatGLM-6B/local", trust_remote_code=True)
model = AutoModel.from_pretrained("/data/text2music/ChatGLM-6B/local", trust_remote_code=True).half().quantize(4).cuda(device=2)

Environment

- OS: Ubuntu 20.04
- Python: 3.7
- Transformers: 4.26.1
- PyTorch: 1.13
- CUDA Support (`python -c "import torch; print(torch.cuda.is_available())"`) : True

Anything else?

No response

Mar 17 '23 02:03 rogerrojur

Same problem. Have you solved this?

Mar 17 '23 03:03 Adenialzz

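Check whether CUDA is installed correctly on your machine, or try adding the CUDA bin directory to your PATH. I reinstalled CUDA and set the PATH; after that the problem was resolved and everything ran normally.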

Mar 17 '23 10:03 188080501

You mention adding a path to the CUDA bin directory. Which path do you mean? The project path?

Mar 18 '23 05:03 Chenny0808

Same problem.

Mar 23 '23 04:03 mh739025250

First, find the shared library whose name starts with "nvrtc" inside the torch package of your environment. In my case, on Windows with miniconda, it is C:\ProgramData\miniconda3\envs\ChatGLM-6B\Lib\site-packages\torch\lib\nvrtc64_112_0.dll. The path will differ between platforms.

Add the directory containing that file to PATH. If you would rather not modify the system-wide PATH, you can append it in your script right after import os, for example:

os.environ['PATH'] = os.environ.get("PATH", "") + os.pathsep + r'C:\ProgramData\miniconda3\envs\ChatGLM-6B\Lib\site-packages\torch\lib'

After that, it should run.
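
A minimal sketch of the step above that avoids hardcoding the path: it locates the installed torch package and appends its lib directory to PATH. The helper names torch_lib_dir and append_to_path are made up for this illustration, and the sketch assumes the CUDA libraries ship inside torch/lib as in the example above, which may not hold for every PyTorch build.

```python
import importlib.util
import os
from typing import MutableMapping, Optional


def torch_lib_dir(package: str = "torch") -> Optional[str]:
    """Return <package>/lib if the package is installed and that directory exists."""
    spec = importlib.util.find_spec(package)  # locates the package without importing it
    if spec is None or not spec.submodule_search_locations:
        return None
    for root in spec.submodule_search_locations:
        candidate = os.path.join(root, "lib")
        if os.path.isdir(candidate):
            return candidate
    return None


def append_to_path(directory: str, env: MutableMapping[str, str] = os.environ) -> bool:
    """Append directory to env['PATH'] unless already listed. Returns True if PATH changed."""
    entries = [p for p in env.get("PATH", "").split(os.pathsep) if p]
    if directory in entries:
        return False
    env["PATH"] = os.pathsep.join(entries + [directory])
    return True


if __name__ == "__main__":
    lib_dir = torch_lib_dir()
    if lib_dir is None:
        print("torch/lib not found; PATH left unchanged")
    else:
        append_to_path(lib_dir)
        print("PATH now includes", lib_dir)
```

For this to have any effect, it presumably needs to run before the model code is loaded, since that is when the quantization kernels look for the CUDA libraries.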

Mar 23 '23 08:03 AnduFalaH

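If you manage your environment with conda:
First run conda list | grep cuda to find the CUDA runtime version in that environment, e.g. 11.7.
Then install cudatoolkit from the nvidia channel: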

conda install cudatoolkit=11.7 -c nvidia
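
As a rough sketch of the two steps above, the CUDA version can also be read from PyTorch itself via torch.version.cuda and turned into the matching install command. conda_install_command is a hypothetical helper named for this example; torch is imported lazily, and the script only prints the command rather than running it. Whether torch.version.cuda matches what conda list reports in a given environment is an assumption worth checking.

```python
from typing import Optional


def conda_install_command(cuda_version: Optional[str]) -> Optional[str]:
    """Build the cudatoolkit install command for a version string such as '11.7'."""
    if not cuda_version:
        return None  # CPU-only PyTorch builds report no CUDA version
    major_minor = ".".join(cuda_version.split(".")[:2])
    return f"conda install cudatoolkit={major_minor} -c nvidia"


if __name__ == "__main__":
    try:
        import torch
        version = torch.version.cuda  # CUDA version this PyTorch build targets, or None
    except ImportError:
        version = None
    print(conda_install_command(version) or "No CUDA-enabled PyTorch found")
```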

Mar 24 '23 13:03 mjysci

mjysci's fix (check the environment's CUDA runtime version with conda list | grep cuda, then run conda install cudatoolkit=11.7 -c nvidia) solved the problem in my testing. My environment:

Windows 11 + WSL2 Debian
pytorch==2.0.0
transformers==4.26.1

Mar 24 '23 17:03 LucienShui

mjysci's conda fix (conda install cudatoolkit=11.7 -c nvidia) works for me. :)

Mar 28 '23 14:03 RRRoger

I hit the same problem in WSL2. Following Microsoft's recommendation, I had not installed any CUDA toolkit inside WSL, and I got the error above: "RuntimeError: Library cudart is not initialized".

Apr 01 '23 12:04 judgementc

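I have the same problem. The suggestions above look like nonsense to me; this is not an environment problem at all. How do I actually fix it? Very frustrating: a few lines of code and this many compatibility issues.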

Apr 03 '23 07:04 gg22mm

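I ran into this problem too and could not find a way to solve it. For now, removing the --quantization_bit 4 option during training, i.e. giving up 4-bit quantization, lets it run.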
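
For the loading path shown in the reproduction script, the same workaround could be sketched as a helper that tries to quantize and falls back to plain fp16 if that fails. prepare_model is a hypothetical name; the sketch assumes the model object exposes half(), quantize(bits) and cuda() as in the reproduction script, and that a failed quantize call leaves the model usable, which I have not verified. The unquantized model also needs more GPU memory than the INT4 one.

```python
from typing import Any, Optional


def prepare_model(model: Any, quantization_bit: Optional[int] = None) -> Any:
    """Convert to fp16, quantize if requested, then move the model to the GPU.

    If quantization raises RuntimeError (for example "Library cudart is not
    initialized"), continue with the unquantized fp16 model instead.
    """
    model = model.half()
    if quantization_bit is not None:
        try:
            model = model.quantize(quantization_bit)
        except RuntimeError as exc:
            print(f"Quantization failed ({exc}); continuing without it")
    return model.cuda()


# Hypothetical usage, mirroring the reproduction script in this issue:
#   from transformers import AutoModel
#   raw = AutoModel.from_pretrained("/path/to/ChatGLM-6B", trust_remote_code=True)
#   model = prepare_model(raw, quantization_bit=4)
```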

Apr 03 '23 22:04 flyingtimes

Same issue. How can it be fixed on Ubuntu?

Apr 04 '23 11:04 weiliswen

flyingtimes is right: removing --quantization_bit 4 during training does make this error go away. I wonder whether the maintainers are aware of it?

Apr 14 '23 08:04 gg22mm

Also, inference hits the same problem, and inference does not have this parameter.

Apr 14 '23 08:04 gg22mm

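This is very likely a mismatch between the installed CUDA version and the CUDA version your PyTorch build targets. On Windows I had CUDA 12 installed while my PyTorch build targeted CUDA 11.8, and I got this error. After uninstalling CUDA and installing CUDA 11.8, it worked.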
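
One way to check for the mismatch described above is to compare the release reported by nvcc --version with torch.version.cuda. This is only a sketch: parse_nvcc_release and same_major_minor are helper names invented here, nvcc is only present when a CUDA toolkit is installed and on PATH, and comparing major.minor is my assumption about what counts as a match.

```python
import re
import shutil
import subprocess
from typing import Optional


def parse_nvcc_release(output: str) -> Optional[str]:
    """Extract 'major.minor' from the text printed by `nvcc --version`."""
    match = re.search(r"release (\d+)\.(\d+)", output)
    return f"{match.group(1)}.{match.group(2)}" if match else None


def same_major_minor(a: str, b: str) -> bool:
    """Compare two version strings on their first two components only."""
    return a.split(".")[:2] == b.split(".")[:2]


if __name__ == "__main__":
    system_cuda = None
    nvcc = shutil.which("nvcc")
    if nvcc:
        result = subprocess.run([nvcc, "--version"], capture_output=True, text=True)
        system_cuda = parse_nvcc_release(result.stdout)
    try:
        import torch
        torch_cuda = torch.version.cuda  # None for CPU-only builds
    except ImportError:
        torch_cuda = None
    print("System CUDA (nvcc):", system_cuda)
    print("PyTorch CUDA:", torch_cuda)
    if system_cuda and torch_cuda and not same_major_minor(system_cuda, torch_cuda):
        print("Versions differ; this may be the mismatch described above")
```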

Apr 18 '23 00:04 yuquant

Me too. Same situation as gg22mm describes above.

Apr 18 '23 05:04 SeekPoint

@gg22mm @SeekPoint

Try removing --quantization_bit 4.

Apr 18 '23 08:04 529106896

As gg22mm noted, the problem does occur at inference time as well. Installing cudatoolkit did not help in my case.

Apr 19 '23 02:04 bingoohe

This problem is caused by missing shared libraries. On Ubuntu 22.04, run:

sudo apt install libcudart11.0 libcublaslt11

On other Linux distributions, look for the corresponding packages.
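
Before installing anything, a quick diagnostic sketch can show whether the system's library search finds these libraries at all. It uses ctypes.util.find_library from the standard library; locate_libraries is a made-up helper name, and the library names cudart, cublasLt and nvrtc are my guesses based on the packages and files mentioned in this thread. Whether the quantization kernels search in exactly the same way is not something this confirms.

```python
import ctypes.util
from typing import Dict, Iterable, Optional


def locate_libraries(names: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each library name to what the system library search finds, or None."""
    return {name: ctypes.util.find_library(name) for name in names}


if __name__ == "__main__":
    for name, found in locate_libraries(["cudart", "cublasLt", "nvrtc"]).items():
        print(f"{name}: {found or 'NOT FOUND'}")
```

A library reported as NOT FOUND here is a reasonable candidate for the missing package.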

Apr 19 '23 14:04 Richard-Ni

AnduFalaH's method above (adding the directory that contains torch's nvrtc library to PATH) works in my environment. Here is a more general version of the code:

import pkg_resources
import os
os.environ['PATH'] = os.environ.get("PATH", "") + os.pathsep + pkg_resources.resource_filename('torch', 'lib')

Apr 26 '23 01:04 l3yx

Richard-Ni's fix (sudo apt install libcudart11.0 libcublaslt11 on Ubuntu 22.04) works.

May 15 '23 03:05 siyuan163

In your conda environment, install the cuda-toolkit version that matches your CUDA version. For example, I am on the latest, CUDA 12.1:
conda install -c "nvidia/label/cuda-12.1.1" cuda-toolkit
https://anaconda.org/nvidia/cuda-toolkit

May 25 '23 06:05 linuxdevopscn

@weiliswen

I tried the same approach on Ubuntu:
conda install cudatoolkit=11.8 -c nvidia

It works for me.

May 25 '23 10:05 GoldExperience

Exactly, that fixes it right away. In addition, on my Ubuntu 22.04 I also ran into a gcc build error, crti.o: no such file or directory, which I fixed with:
sudo apt install libc6=2.35-0ubuntu3
sudo apt install libc6-dev

Jun 01 '23 10:06 murainwood

On Linux it may be possible to solve this as follows; see: Support loading cuda libraries from nvidia package. https://github.com/OpenBMB/cpm_kernels/pull/8

Jun 14 '23 04:06 codingfun2022

Richard-Ni's fix (sudo apt install libcudart11.0 libcublaslt11 on Ubuntu 22.04) works. Thank you very much.

Jun 15 '23 16:06 jushe

Regarding Richard-Ni's fix (sudo apt install libcudart11.0 libcublaslt11 on Ubuntu 22.04): I have the same environment and hit the same problem, and it works for me. Thanks.

Jun 26 '23 09:06 ablozhou

Richard-Ni's answer (sudo apt install libcudart11.0 libcublaslt11 on Ubuntu 22.04) is correct. On Ubuntu 20.04, run this instead:
sudo apt install libcudart10.1 libcublaslt10

Jul 01 '23 05:07 njutsiang

The approach from l3yx above (AnduFalaH's PATH fix, using the generic pkg_resources snippet to append torch's lib directory to PATH) solved my problem.

Jul 07 '23 03:07 KelvinJhu

Regarding the apt fixes from Richard-Ni (libcudart11.0 libcublaslt11 on Ubuntu 22.04) and njutsiang (libcudart10.1 libcublaslt10 on Ubuntu 20.04): the versions must match, otherwise nvidia-smi fails with "Failed to initialize NVML: Driver/library version mismatch".

Jul 11 '23 02:07 yzbx
