MOSS

torch 1.10.1, CUDA 11.3: inference fails with "RuntimeError: CUDA error: no kernel image...". Is this because of insufficient VRAM? The GPU is a 3080.

Open Ben2522662 opened this issue 1 year ago • 1 comments

python moss_cli_demo.py
Fetching 17 files: 100%|██████████████████████████████████████████████████████████████████████████████████| 17/17 [00:00<00:00, 83787.51it/s]
Waiting for all devices to be ready, it may take a few minutes...
欢迎使用 MOSS 人工智能助手!输入内容即可进行对话。输入 clear 以清空对话历史,输入 stop 以终止对话。
<|Human|>: hello
Traceback (most recent call last):
  File "moss_cli_demo.py", line 89, in <module>
    main()
  File "moss_cli_demo.py", line 72, in main
    outputs = model.generate(
  File "/root/miniconda3/envs/moss2/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
    return func(*args, **kwargs)
  File "/root/miniconda3/envs/moss2/lib/python3.8/site-packages/transformers/generation/utils.py", line 1358, in generate
    if pad_token_id is not None and torch.sum(inputs_tensor[:, -1] == pad_token_id) > 0:
RuntimeError: CUDA error: no kernel image is available for execution on the device
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.

Ben2522662, Apr 27 '23 01:04

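requirements.txt pins pytorch==1.13.1, and that PyTorch version no longer seems to ship a CUDA 11.3 build, so you should use the pytorch 1.13.1 cu117 build instead.

Also, without quantization, fp16 needs 32 GB of VRAM, which a 3080 certainly does not have.

Running the quantized version requires compiling gptq, which means installing the CUDA toolkit for the build, at the same CUDA version as your PyTorch build.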

yhyu13, Apr 27 '23 03:04
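Editor's sketch (not part of the original thread): one way to check both points from the reply is to compare the GPU's compute capability against the architectures the installed PyTorch build was compiled for, and to estimate the fp16 weight footprint. The helper names `arch_supported` and `fp16_weight_gb` are hypothetical, and the 16-billion parameter count is an assumption chosen to match the 32 GB figure above. An RTX 3080 reports compute capability (8, 6).

```python
import re


def arch_supported(capability, arch_list):
    """Return True if a build compiled for `arch_list` can run on a GPU
    with the given (major, minor) compute capability.

    Hypothetical helper: `arch_list` uses the "sm_86" / "compute_70"
    naming that torch.cuda.get_arch_list() returns.
    """
    major, minor = capability
    for arch in arch_list:
        m = re.fullmatch(r"(sm|compute)_(\d+)", arch)
        if not m:
            continue  # skip entries this sketch does not understand
        kind, digits = m.group(1), m.group(2)
        a_major, a_minor = int(digits[:-1]), int(digits[-1])
        if kind == "sm":
            # Compiled binaries run only within the same major version,
            # on devices with an equal or higher minor version.
            if a_major == major and a_minor <= minor:
                return True
        else:
            # PTX ("compute_XY") can be JIT-compiled for any newer device.
            if (a_major, a_minor) <= (major, minor):
                return True
    return False


def fp16_weight_gb(n_params):
    """Approximate size of the weights alone in fp16 (2 bytes per parameter), in GB."""
    return n_params * 2 / 1e9


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        torch = None

    if torch is not None and torch.cuda.is_available():
        cap = torch.cuda.get_device_capability(0)
        archs = torch.cuda.get_arch_list()
        print("torch", torch.__version__, "built for CUDA", torch.version.cuda)
        print("device capability:", cap, "build archs:", archs)
        print("kernels available for this GPU:", arch_supported(cap, archs))
    else:
        print("PyTorch with a visible CUDA device is not available here.")

    # Assumed parameter count (16e9), for illustration only.
    print("fp16 weights:", fp16_weight_gb(16e9), "GB")
```

If `arch_supported` returns False, the "no kernel image" error points to a build/GPU architecture mismatch rather than to memory. The memory estimate covers the weights only; activations and the KV cache add to it.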