llama2.rs
Tensor has shape torch.Size([448, 1024]) ... this looks incorrect.
Thank you for building this - very interested in trying it. In my hands, when I try to export the model to a .bin, I get the following error - is this something simple / a user error?
(macOS Ventura 13.5.1 with a Conda environment)
❯ python export.py l70b.act64.bin TheBloke/llama-2-70b-Guanaco-QLoRA-GPTQ gptq-4bit-64g-actorder_True
CUDA extension not installed.
Traceback (most recent call last):
File "/Users/timothypark/dev/llama2.rs/export.py", line 150, in <module>
load_and_export(model_name, revision, output_path)
File "/Users/timothypark/dev/llama2.rs/export.py", line 128, in load_and_export
model = AutoGPTQForCausalLM.from_quantized(model_name,
File "/opt/homebrew/anaconda3/envs/pytorch/lib/python3.10/site-packages/auto_gptq/modeling/auto.py", line 105, in from_quantized
return quant_func(
File "/opt/homebrew/anaconda3/envs/pytorch/lib/python3.10/site-packages/auto_gptq/modeling/_base.py", line 847, in from_quantized
accelerate.utils.modeling.load_checkpoint_in_model(
File "/opt/homebrew/anaconda3/envs/pytorch/lib/python3.10/site-packages/accelerate/utils/modeling.py", line 1409, in load_checkpoint_in_model
load_offloaded_weights(model, state_dict_index, state_dict_folder)
File "/opt/homebrew/anaconda3/envs/pytorch/lib/python3.10/site-packages/accelerate/utils/modeling.py", line 727, in load_offloaded_weights
set_module_tensor_to_device(model, param_name, "cpu", value=weight, fp16_statistics=fp16_statistics)
File "/opt/homebrew/anaconda3/envs/pytorch/lib/python3.10/site-packages/accelerate/utils/modeling.py", line 281, in set_module_tensor_to_device
raise ValueError(
ValueError: Trying to set a tensor of shape torch.Size([448, 1024]) in "qzeros" (which has shape torch.Size([224, 1024])), this look incorrect.
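For what it's worth, the factor-of-two mismatch (448 vs 224 rows) looks like the kind of thing a group-size mismatch would produce. This is just a sketch of that hunch, assuming the usual AutoGPTQ packing layout where `qzeros` has shape `(in_features // group_size, out_features * bits // 32)`; the layer dimensions below (28672 x 8192, i.e. a Llama-2-70B down_proj) are my assumption, not something confirmed by the traceback:

```python
# Hedged sanity check on GPTQ qzeros shapes (assumes the standard
# AutoGPTQ QuantLinear packing; not taken from the llama2.rs code).
def qzeros_shape(in_features, out_features, bits, group_size):
    # One packed zero-point row per quantization group, with `bits`-bit
    # values packed into 32-bit ints along the output dimension.
    return (in_features // group_size, out_features * bits // 32)

# Assumed Llama-2-70B down_proj dimensions: in=28672, out=8192, 4-bit.
print(qzeros_shape(28672, 8192, 4, 64))   # (448, 1024) - the checkpoint tensor
print(qzeros_shape(28672, 8192, 4, 128))  # (224, 1024) - what the model expects
```

If that arithmetic holds, the model may be getting instantiated with group_size=128 while the checkpoint was quantized with group_size=64, which would point at the revision/quantize_config being read rather than at the weights themselves.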