### Is there an existing issue for this?
- [X] I have searched the existing issues
### Current Behavior
Running `sh train.sh` raises an exception:
```text
[WARNING|modeling_utils.py:3034] 2023-04-23 19:20:49,519 >> Some weights of ChatGLMForConditionalGeneration were not initialized from the model checkpoint at chatglm-6b and are newly initialized: ['transformer.prefix_encoder.embedding.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
[INFO|modeling_utils.py:2690] 2023-04-23 19:20:49,569 >> Generation config file not found, using a generation config created from the model config.
Quantized to 4 bit
Traceback (most recent call last):
  File "main.py", line 431, in <module>
    main()
  File "main.py", line 129, in main
    model = model.quantize(model_args.quantization_bit)
  File "/root/.cache/huggingface/modules/transformers_modules/chatglm-6b/modeling_chatglm.py", line 1434, in quantize
    self.transformer = quantize(self.transformer, bits, empty_init=empty_init, **kwargs)
  File "/root/.cache/huggingface/modules/transformers_modules/chatglm-6b/quantization.py", line 157, in quantize
    layer.attention.query_key_value = QuantizedLinear(
  File "/root/.cache/huggingface/modules/transformers_modules/chatglm-6b/quantization.py", line 137, in __init__
    self.weight = compress_int4_weight(self.weight)
  File "/root/.cache/huggingface/modules/transformers_modules/chatglm-6b/quantization.py", line 78, in compress_int4_weight
    kernels.int4WeightCompression(
  File "/home/xiezizhe/anaconda3090/envs/glm/lib/python3.8/site-packages/cpm_kernels/kernels/base.py", line 48, in __call__
    func = self._prepare_func()
  File "/home/xiezizhe/anaconda3090/envs/glm/lib/python3.8/site-packages/cpm_kernels/kernels/base.py", line 40, in _prepare_func
    self._module.get_module(), self._func_name
  File "/home/xiezizhe/anaconda3090/envs/glm/lib/python3.8/site-packages/cpm_kernels/kernels/base.py", line 23, in get_module
    Device(curr_device).use()  # force initialize context
  File "/home/xiezizhe/anaconda3090/envs/glm/lib/python3.8/site-packages/cpm_kernels/device/__init__.py", line 152, in use
    self._device.use()
  File "/home/xiezizhe/anaconda3090/envs/glm/lib/python3.8/site-packages/cpm_kernels/device/__init__.py", line 120, in use
    self.cublasLtHandle = cublaslt.cublasLtCreate()
  File "/home/xiezizhe/anaconda3090/envs/glm/lib/python3.8/site-packages/cpm_kernels/library/base.py", line 94, in wrapper
    return f(*args, **kwargs)
  File "/home/xiezizhe/anaconda3090/envs/glm/lib/python3.8/site-packages/cpm_kernels/library/cublaslt.py", line 105, in cublasLtCreate
    checkCublasStatus(cublasLt.cublasLtCreate(ctypes.byref(handle)))
  File "/home/xiezizhe/anaconda3090/envs/glm/lib/python3.8/site-packages/cpm_kernels/library/cublaslt.py", line 98, in checkCublasStatus
    raise RuntimeError("CUBLAS error: {}".format(
RuntimeError: CUBLAS error: CUBLAS_STATUS_NOT_INITIALIZED
```
### Expected Behavior
No response
### Steps To Reproduce

```shell
sh train.sh
```
### Environment

- OS: Ubuntu 18.04.5 LTS
- Python: 3.8.16
- Transformers: 4.27.1
- PyTorch: 1.13.1
- CUDA Support (`python -c "import torch; print(torch.cuda.is_available())"`): True
### Anything else?
No response
This error is usually caused by running out of GPU memory; see:
https://discuss.pytorch.org/t/cuda-error-cublas-status-not-initialized-when-calling-cublascreate-handle/125450
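Since `CUBLAS_STATUS_NOT_INITIALIZED` from `cublasLtCreate` typically means cuBLAS could not allocate its workspace on the device, a quick first check is how much GPU memory is already in use before launching `train.sh`. A minimal sketch, assuming `nvidia-smi` is on `PATH` (the function name is hypothetical; it returns `None` when no NVIDIA driver is present):

```python
import subprocess

def gpu_memory_report():
    """Query used/total GPU memory via nvidia-smi.

    Returns the raw CSV text, or None if nvidia-smi is not
    available (e.g. no NVIDIA driver installed on this host).
    """
    try:
        result = subprocess.run(
            ["nvidia-smi",
             "--query-gpu=memory.used,memory.total",
             "--format=csv"],
            capture_output=True, text=True, check=True,
        )
        return result.stdout
    except (FileNotFoundError, subprocess.CalledProcessError):
        return None

if __name__ == "__main__":
    report = gpu_memory_report()
    print(report if report else "nvidia-smi not available")
```

If another process is already holding most of the card's memory, freeing it (or moving training to an idle GPU via `CUDA_VISIBLE_DEVICES`) should let the cuBLAS handle initialize.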