Loading model ...
Found 3 unique KN Linear values.
Warming up autotune cache ...
100%|█████████████████████████████████████████████████████████████████████████████████████████| 12/12 [00:34<00:00, 2.85s/it]
Found 1 unique fused mlp KN values.
Warming up autotune cache ...
100%|█████████████████████████████████████████████████████████████████████████████████████████| 12/12 [00:17<00:00, 1.45s/it]
Done.
Traceback (most recent call last):
  File "llama_inference.py", line 120, in <module>
    generated_ids = model.generate(
  File "/opt/conda/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/transformers/generation/utils.py", line 1485, in generate
    return self.sample(
  File "/opt/conda/lib/python3.8/site-packages/transformers/generation/utils.py", line 2524, in sample
    outputs = self(
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/transformers/models/llama/modeling_llama.py", line 687, in forward
    outputs = self.model(
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/transformers/models/llama/modeling_llama.py", line 577, in forward
    layer_outputs = decoder_layer(
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/transformers/models/llama/modeling_llama.py", line 292, in forward
    hidden_states, self_attn_weights, present_key_value = self.self_attn(
  File "/opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "/workspace/luzijia/GPTQ-for-LLaMa-triton/quant/fused_attn.py", line 154, in forward
    with torch.backends.cuda.sdp_kernel(enable_math=False):
AttributeError: module 'torch.backends.cuda' has no attribute 'sdp_kernel'
I met the same error. Have you solved it? Is this a problem with the torch version?
I got the same error on PyTorch 1.12.1. After I updated to 2.0.1, it was gone.
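For anyone landing here: `torch.backends.cuda.sdp_kernel` was only added in PyTorch 2.0, so 1.12.x raises this AttributeError, and upgrading is the clean fix. If you are stuck on an older torch, here is a minimal guard sketch (`sdp_ctx` is a hypothetical helper of mine, not code from this repo):

```python
# Sketch, assuming the guarded block only needs a no-op context on torch < 2.0.
import contextlib

import torch

def sdp_ctx():
    """Return the sdp_kernel context on torch >= 2.0, else a no-op context."""
    if hasattr(torch.backends.cuda, "sdp_kernel"):
        # PyTorch >= 2.0: disable the math backend, as fused_attn.py intends.
        return torch.backends.cuda.sdp_kernel(enable_math=False)
    # PyTorch < 2.0: the attribute does not exist, so fall back to a no-op.
    return contextlib.nullcontext()

with sdp_ctx():
    ...  # attention computation goes here
```

Note that `F.scaled_dot_product_attention` is itself PyTorch 2.0-only, so if the code inside the guarded block calls it (as fused_attn.py likely does), this sketch only moves the failure point; upgrading to 2.0+ is the real solution.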