
[BUG/Help] RuntimeError: CUBLAS error: CUBLAS_STATUS_NOT_INITIALIZED

Open JianFeiWang opened this issue 2 years ago • 1 comment

Is there an existing issue for this?

  • [X] I have searched the existing issues

Current Behavior

Running `sh train.sh` raises the following exception:

[WARNING|modeling_utils.py:3034] 2023-04-23 19:20:49,519 >> Some weights of ChatGLMForConditionalGeneration were not initialized from the model checkpoint at chatglm-6b and are newly initialized: ['transformer.prefix_encoder.embedding.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
[INFO|modeling_utils.py:2690] 2023-04-23 19:20:49,569 >> Generation config file not found, using a generation config created from the model config.
Quantized to 4 bit
Traceback (most recent call last):
  File "main.py", line 431, in <module>
    main()
  File "main.py", line 129, in main
    model = model.quantize(model_args.quantization_bit)
  File "/root/.cache/huggingface/modules/transformers_modules/chatglm-6b/modeling_chatglm.py", line 1434, in quantize
    self.transformer = quantize(self.transformer, bits, empty_init=empty_init, **kwargs)
  File "/root/.cache/huggingface/modules/transformers_modules/chatglm-6b/quantization.py", line 157, in quantize
    layer.attention.query_key_value = QuantizedLinear(
  File "/root/.cache/huggingface/modules/transformers_modules/chatglm-6b/quantization.py", line 137, in __init__
    self.weight = compress_int4_weight(self.weight)
  File "/root/.cache/huggingface/modules/transformers_modules/chatglm-6b/quantization.py", line 78, in compress_int4_weight
    kernels.int4WeightCompression(
  File "/home/xiezizhe/anaconda3090/envs/glm/lib/python3.8/site-packages/cpm_kernels/kernels/base.py", line 48, in __call__
    func = self._prepare_func()
  File "/home/xiezizhe/anaconda3090/envs/glm/lib/python3.8/site-packages/cpm_kernels/kernels/base.py", line 40, in _prepare_func
    self._module.get_module(), self._func_name
  File "/home/xiezizhe/anaconda3090/envs/glm/lib/python3.8/site-packages/cpm_kernels/kernels/base.py", line 23, in get_module
    Device(curr_device).use()  # force initialize context
  File "/home/xiezizhe/anaconda3090/envs/glm/lib/python3.8/site-packages/cpm_kernels/device/__init__.py", line 152, in use
    self._device.use()
  File "/home/xiezizhe/anaconda3090/envs/glm/lib/python3.8/site-packages/cpm_kernels/device/__init__.py", line 120, in use
    self.cublasLtHandle = cublaslt.cublasLtCreate()
  File "/home/xiezizhe/anaconda3090/envs/glm/lib/python3.8/site-packages/cpm_kernels/library/base.py", line 94, in wrapper
    return f(*args, **kwargs)
  File "/home/xiezizhe/anaconda3090/envs/glm/lib/python3.8/site-packages/cpm_kernels/library/cublaslt.py", line 105, in cublasLtCreate
    checkCublasStatus(cublasLt.cublasLtCreate(ctypes.byref(handle)))
  File "/home/xiezizhe/anaconda3090/envs/glm/lib/python3.8/site-packages/cpm_kernels/library/cublaslt.py", line 98, in checkCublasStatus
    raise RuntimeError("CUBLAS error: {}".format(
RuntimeError: CUBLAS error: CUBLAS_STATUS_NOT_INITIALIZED

Expected Behavior

No response

Steps To Reproduce

sh train.sh

Environment

- OS: Ubuntu 18.04.5 LTS
- Python: 3.8.16
- Transformers: 4.27.1
- PyTorch: 1.13.1
- CUDA Support (`python -c "import torch; print(torch.cuda.is_available())"`): True

Anything else?

No response

JianFeiWang avatar Apr 23 '23 11:04 JianFeiWang

This error is usually caused by running out of GPU memory when cuBLAS tries to create its handle; see https://discuss.pytorch.org/t/cuda-error-cublas-status-not-initialized-when-calling-cublascreate-handle/125450
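Since the failure happens at `cublasLtCreate()` before any kernel runs, a quick sanity check is to confirm the GPU actually has free memory before launching `train.sh`. Below is a minimal sketch of such a check; the `free_vram_mib` helper and the 8000 MiB threshold are illustrative assumptions, not part of this repo (4-bit quantization of ChatGLM-6B needs on the order of several GiB free).

```python
import subprocess

def free_vram_mib(smi_output=None):
    """Return the free memory (in MiB) of each visible GPU.

    Parses `nvidia-smi --query-gpu=memory.free` output; pass a
    pre-captured string via `smi_output` for testing without a GPU.
    """
    if smi_output is None:
        smi_output = subprocess.check_output(
            ["nvidia-smi", "--query-gpu=memory.free",
             "--format=csv,noheader,nounits"],
            text=True,
        )
    return [int(line) for line in smi_output.strip().splitlines()]

if __name__ == "__main__":
    # Warn before training if another process is already occupying the GPU;
    # the 8000 MiB threshold here is a rough illustrative guess.
    for idx, free in enumerate(free_vram_mib()):
        status = "ok" if free > 8000 else "likely too low for quantization"
        print(f"GPU {idx}: {free} MiB free ({status})")
```

If the free memory is near zero, kill the processes listed by `nvidia-smi` (or pick an idle card with `CUDA_VISIBLE_DEVICES`) and rerun `sh train.sh`.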

duzx16 avatar Apr 24 '23 03:04 duzx16