Qwen2.5
Deploying on a V100 fails with the following error, which is quite strange: cutlassF: no kernel found to launch!
03/19 21:17:01 llm_wrapper.py:msg:55 Caught an exception: RuntimeError: cutlassF: no kernel found to launch!
Traceback (most recent call last):
  File "/home/admin/workspace/aop_lab/app_source/llm_wrapper.py", line 43, in msg
    response, usage = await self.call(messages_, **kwargs)
  File "/home/admin/workspace/aop_lab/app_source/llm_wrapper.py", line 154, in call
    generated_ids = self.model.generate(model_inputs.input_ids,
  File "/home/admin/miniconda3/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/admin/miniconda3/lib/python3.10/site-packages/transformers/generation/utils.py", line 1544, in generate
    return self.greedy_search(
  File "/home/admin/miniconda3/lib/python3.10/site-packages/transformers/generation/utils.py", line 2404, in greedy_search
    outputs = self(
  File "/home/admin/miniconda3/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/admin/miniconda3/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/admin/miniconda3/lib/python3.10/site-packages/transformers/models/qwen2/modeling_qwen2.py", line 1173, in forward
    outputs = self.model(
  File "/home/admin/miniconda3/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/admin/miniconda3/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/admin/miniconda3/lib/python3.10/site-packages/transformers/models/qwen2/modeling_qwen2.py", line 1058, in forward
    layer_outputs = decoder_layer(
  File "/home/admin/miniconda3/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/admin/miniconda3/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/admin/miniconda3/lib/python3.10/site-packages/transformers/models/qwen2/modeling_qwen2.py", line 773, in forward
    hidden_states, self_attn_weights, present_key_value = self.self_attn(
  File "/home/admin/miniconda3/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/admin/miniconda3/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/admin/miniconda3/lib/python3.10/site-packages/transformers/models/qwen2/modeling_qwen2.py", line 698, in forward
    attn_output = torch.nn.functional.scaled_dot_product_attention(
RuntimeError: cutlassF: no kernel found to launch!
import torch

# Disable the memory-efficient (cutlass) and flash SDPA backends so that
# scaled_dot_product_attention falls back to the math kernel, which is
# available on the V100 for this dtype.
torch.backends.cuda.enable_mem_efficient_sdp(False)
torch.backends.cuda.enable_flash_sdp(False)
https://github.com/huggingface/transformers/issues/28731
That's how I solved it. Alternatively, specify "torch_dtype": torch.float16 when loading the model.
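A minimal sketch of the float16 workaround when loading the model (the function name is mine, and the checkpoint path is illustrative; substitute your own local path):

```python
def load_qwen_fp16(model_path="Qwen/Qwen2.5-7B-Instruct"):
    """Load Qwen2.5 in float16 so SDPA can select a kernel on a V100.

    Imports are kept inside the function so the module can be imported
    without torch/transformers installed.
    """
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_path)
    model = AutoModelForCausalLM.from_pretrained(
        model_path,
        torch_dtype=torch.float16,  # avoid the default fp32 path that triggers the error
        device_map="auto",
    )
    return model, tokenizer
```

With this, the generate() call in llm_wrapper.py should no longer hit the cutlassF kernel-selection failure.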
Also, please update PyTorch to >= 2.2.0.
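The version requirement can be checked programmatically before applying the workaround; a small sketch (the helper name is mine):

```python
def torch_meets_minimum(torch_version, minimum=(2, 2, 0)):
    """Return True if a torch version string satisfies the minimum.

    Strips local build suffixes such as "+cu118" before comparing,
    e.g. "2.1.2+cu118" -> (2, 1, 2).
    """
    core = torch_version.split("+")[0]
    parts = tuple(int(p) for p in core.split(".")[:3])
    return parts >= minimum
```

In practice you would pass torch.__version__ to this helper and only install the upgrade (or enable the SDP-backend workaround) when it returns False.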