GPTQ-for-LLaMa
inference with the saved model error: AttributeError: module 'torch.backends.cuda' has no attribute 'sdp_kernel'
Loading model ...
Found 3 unique KN Linear values.
Warming up autotune cache ...
100%|█████████████████████████████████████████████████████████████████████████████████████████| 12/12 [00:34<00:00, 2.85s/it]
Found 1 unique fused mlp KN values.
Warming up autotune cache ...
100%|█████████████████████████████████████████████████████████████████████████████████████████| 12/12 [00:17<00:00, 1.45s/it]
Done.
Traceback (most recent call last):
File "llama_inference.py", line 120, in <module>
AttributeError: module 'torch.backends.cuda' has no attribute 'sdp_kernel'
I'm hitting the same error. Have you solved it? Is this a problem with the torch version?
I got the same error on PyTorch 1.12.1. After updating to 2.0.1, it's gone.
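For anyone who can't upgrade right away: `torch.backends.cuda.sdp_kernel` was only introduced in PyTorch 2.0, so on 1.x the attribute simply doesn't exist and the call raises `AttributeError`. A minimal sketch of a guard (the helper name `has_sdp_kernel` is just illustrative, not part of the repo) that checks the version string without even importing torch:

```python
def has_sdp_kernel(torch_version: str) -> bool:
    """Return True if this PyTorch version should expose
    torch.backends.cuda.sdp_kernel (introduced in 2.0)."""
    # Strip any local build suffix like "+cu118" before comparing.
    major = int(torch_version.split("+")[0].split(".")[0])
    return major >= 2

print(has_sdp_kernel("1.12.1"))  # False -> expect the AttributeError
print(has_sdp_kernel("2.0.1"))   # True  -> sdp_kernel should be available
```

In a running script you can get the same effect more directly with `hasattr(torch.backends.cuda, "sdp_kernel")` and fall back to ordinary attention when it's missing.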