gpt-fast
What is `torch.ops.aten._convert_weight_to_int4pack` ?
I'm using `torch.__version__` = 2.1.0a0+32f93b1, which doesn't have this op:
AttributeError: '_OpNamespace' 'aten' object has no attribute '_convert_weight_to_int4pack'
What exactly does this do, and is it defined elsewhere?
Unfortunately upgrading to the latest torch-dev breaks flash-attention2
One possible reason for this issue is a CPU-only build. Could you check `torch.cuda.is_available()` just to be sure?
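For example, something like this prints the version, whether CUDA is visible, and whether the op exists at all (just a quick probe, not specific to gpt-fast):

```python
import torch

print(torch.__version__)
print(torch.cuda.is_available())
# On builds that predate the op this prints False instead of raising AttributeError.
print(hasattr(torch.ops.aten, "_convert_weight_to_int4pack"))
```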
Thanks for the response:
torch.cuda.is_available() = True
No problems with training either.
Also note I'm using this docker image:
nvcr.io/nvidia/pytorch:23.10-py3
Ah, you need the PyTorch nightly build for this repo.
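Something like `pip install --pre torch --index-url https://download.pytorch.org/whl/nightly/cu121` should pull it in (swap the cu121 suffix for your CUDA version).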
Have you solved this problem? I got the same error when quantizing llama 7b to int4.
You should use the nightly version of torch, or at least the recent 2.2 branch cut; it's a newish op that was added for int4 support.
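If it helps, here is a rough sketch of how the op fits into int4 weight-only quantization, along the lines of what gpt-fast's quantize path does. The shapes, dtypes and tiling parameters below are illustrative assumptions and have changed between torch versions, so don't treat this as the exact API contract:

```python
# Minimal sketch of int4 weight-only quantization with these ops.
# Assumes a CUDA build of a recent torch (nightly / 2.2-era); exact dtypes,
# shapes and the op signatures have shifted between versions.
import torch

assert torch.cuda.is_available()

out_features, in_features = 4096, 4096
groupsize = 128        # quantization group size along the k dimension
inner_k_tiles = 8      # tiling parameter expected by the fused kernel

# Fake 4-bit quantized weight: values 0..15 stored one-per-int32, as produced
# before packing. Real code derives these (plus scales/zeros) from the bf16 weight.
w_int32 = torch.randint(0, 16, (out_features, in_features),
                        dtype=torch.int32, device="cuda")

# Repack the 4-bit values into the layout the fused int4 matmul kernel expects.
w_int4pack = torch.ops.aten._convert_weight_to_int4pack(w_int32, inner_k_tiles)

# Per-group scales and zero points, bf16, shape [k // groupsize, n, 2]
# (random here, just to make the call run).
scales_and_zeros = torch.randn(in_features // groupsize, out_features, 2,
                               dtype=torch.bfloat16, device="cuda")

# Fused int4 weight-only matmul: activations stay bf16, weights stay packed int4.
x = torch.randn(1, in_features, dtype=torch.bfloat16, device="cuda")
y = torch.ops.aten._weight_int4pack_mm(x, w_int4pack, groupsize, scales_and_zeros)
print(y.shape)  # (1, out_features)
```

The idea is that `_convert_weight_to_int4pack` reorders the 4-bit values into the layout the fused kernel wants, and `_weight_int4pack_mm` then does the bf16-activation x int4-weight matmul in a single call.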