AutoAWQ
bloomz_7b1 error message TypeError: forward() missing 1 required positional argument: 'alibi'
RUNNING:
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer
model_path = '/data/bloomz_7b1'
quant_path = '/data/bloomz_7b1_4bit'
quant_config = { "zero_point": True, "q_group_size": 128, "w_bit": 8, "version": "GEMM"}
# Load model
# NOTE: pass safetensors=True to load safetensors
model = AutoAWQForCausalLM.from_pretrained(
model_path, **{"low_cpu_mem_usage": True, "use_cache": False}
)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
# Quantize
model.quantize(tokenizer, quant_config=quant_config, text_column="text")
# Save quantized model
model.save_quantized(quant_path)
tokenizer.save_pretrained(quant_path)
ERROR:
/root/miniconda3/envs/glmvllm/lib/python3.9/site-packages/huggingface_hub/repocard.py:105: UserWarning: Repo card metadata block was not found. Setting CardData to empty.
warnings.warn("Repo card metadata block was not found. Setting CardData to empty.")
AWQ: 0%| | 0/30 [00:00<?, ?it/s]
Traceback (most recent call last):
File "/data/LLaMA-Factory/autoawq_demo/quan_demo.py", line 20, in <module>
model.quantize(tokenizer, quant_config=quant_config,text_column="text")
File "/root/miniconda3/envs/glmvllm/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/root/miniconda3/envs/glmvllm/lib/python3.9/site-packages/awq/models/base.py", line 93, in quantize
quantizer.quantize()
File "/root/miniconda3/envs/glmvllm/lib/python3.9/site-packages/awq/quantize/quantizer.py", line 95, in quantize
input_feat = self._get_input_feat(self.modules[i], named_linears)
File "/root/miniconda3/envs/glmvllm/lib/python3.9/site-packages/awq/quantize/quantizer.py", line 406, in _get_input_feat
self.inps = layer(self.inps, **module_kwargs)[0]
File "/root/miniconda3/envs/glmvllm/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/root/miniconda3/envs/glmvllm/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
TypeError: forward() missing 1 required positional argument: 'alibi'
How should I resolve this error? Thanks.
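For context, here is my reading of the traceback (not a confirmed diagnosis): AutoAWQ replays each decoder layer with layer(self.inps, **module_kwargs), using the kwargs it cached from the first forward pass, while BLOOM/Falcon blocks take alibi as a required positional argument that does not appear to be among those cached kwargs. A toy sketch of that failure pattern (hypothetical ToyAlibiBlock, not the real BloomBlock):
# Toy sketch: if 'alibi' was never captured into the cached kwargs,
# replaying the layer fails with exactly this TypeError.
import torch
import torch.nn as nn

class ToyAlibiBlock(nn.Module):
    def forward(self, hidden_states, alibi, attention_mask=None):
        # 'alibi' is a required positional argument, mirroring BLOOM/Falcon decoder blocks
        return (hidden_states,)

layer = ToyAlibiBlock()
cached_kwargs = {"attention_mask": None}      # 'alibi' is missing from the cached kwargs
layer(torch.zeros(1, 4, 8), **cached_kwargs)  # TypeError: forward() missing ... 'alibi'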
Which transformers version are you using? Could you try 4.34.1 and 4.35.2? A little background: the recent 4.36 release broke a lot of things around how we cache arguments in AutoAWQ. We have mostly fixed that, but there are still edge cases like this.
CC: @younesbelkada
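If it helps while testing different pins, here is a quick way to confirm which transformers version the quantization script actually imports (the downgrade itself is done outside Python, e.g. pip install "transformers==4.35.2"):
# Print the transformers version visible to the environment running the script.
import transformers
print(transformers.__version__)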
At first, I was using 4.36.2, but I have now tried 4.34.1 and 4.35.2 without resolving the issue.
With 4.34.1:
ImportError: cannot import name 'insecure_hashlib' from 'huggingface_hub.utils' (/root/miniconda3/envs/glmvllm/lib/python3.9/site-packages/huggingface_hub/utils/__init__.py)
With 4.35.2:
ImportError: cannot import name 'MoeModelOutputWithPast' from 'transformers.modeling_outputs' (/root/miniconda3/envs/glmvllm/lib/python3.9/site-packages/transformers/modeling_outputs.py)
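Those ImportErrors usually point to version mismatches elsewhere in the environment rather than to AutoAWQ itself: some installed package expects a newer huggingface_hub (insecure_hashlib), and another expects a newer transformers (MoeModelOutputWithPast was only added around 4.36). A small snippet to make the installed combination visible (the package list below is just a suggestion):
# Print the versions of the packages most likely involved in the mismatch.
from importlib.metadata import PackageNotFoundError, version

for pkg in ("transformers", "huggingface_hub", "datasets", "autoawq", "tokenizers"):
    try:
        print(pkg, version(pkg))
    except PackageNotFoundError:
        print(pkg, "not installed")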
Hi, I have a similar issue while quantizing the Falcon-1B model:
TypeError: FalconDecoderLayer.forward() missing 1 required positional argument: 'alibi'
Here is my code:
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer
model_path = "tiiuae/falcon-rw-1b"
quant_path = 'falcon-rw-1b-awq'
quant_config = { "zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM" }
# Load model
model = AutoAWQForCausalLM.from_pretrained(model_path, **{"low_cpu_mem_usage": True})
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
# Quantize
model.quantize(tokenizer, quant_config=quant_config)
# Save quantized model
model.save_quantized(quant_path)
tokenizer.save_pretrained(quant_path)
My setup:
- CUDA v12.1
- PyTorch v2.2.0
- autoawq==0.2.3
- autoawq_kernels==0.0.6
@oreojason, have you managed to fix this issue? Or have you, @casper-hansen, already figured out the reason for this behaviour?
Hi, I still have not figured out what the issue is. I tried with Falcon-7B and it throws the same error. Quantizing Falcon-7B works with autoawq==0.1.7 (and prior, 0.1.6 also worked). Give it a try.
My setup: torch==2.1.2, torchvision==0.16.2, autoawq==0.1.7, autoawq_kernels==0.0.6
The problem for me is converting AWQ to GGUF. This support was added in autoawq==0.2.0. However, applying the AWQ scales fails with this error:
TypeError: FalconDecoderLayer.forward() missing 1 required positional argument: 'alibi'.
I have tried autoawq>=0.2.0 and it still fails. Any ideas, @casper-hansen?
It seems the implementation broke a while ago. Unfortunately, I do not currently have the capacity to research old models that break with new updates. I welcome all PRs and will help review them if you want to research how to fix the issue.
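For anyone who wants to dig into this, one possible direction (purely an untested sketch, not how AutoAWQ currently works): build the alibi tensor with the helper that transformers already ships for BLOOM and make sure it ends up in the kwargs that get replayed through each decoder layer. The num_heads and seq_len values below are illustrative, and wiring this into awq/quantize/quantizer.py is not shown here.
# Untested sketch: transformers' BLOOM implementation exposes build_alibi_tensor,
# which produces the tensor the decoder layers expect as the 'alibi' argument.
import torch
from transformers.models.bloom.modeling_bloom import build_alibi_tensor

attention_mask = torch.ones(1, 512, dtype=torch.long)  # batch_size=1, seq_len=512 (illustrative)
alibi = build_alibi_tensor(attention_mask, num_heads=32, dtype=torch.float16)
print(alibi.shape)  # (batch_size * num_heads, 1, seq_len)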
I have tried AWQ on Falcon-1B and it gives me the same issue: TypeError: FalconDecoderLayer.forward() missing 1 required positional argument: 'alibi'.
My versions: autoawq==0.2.5, autoawq_kernels==0.0.6.
When I tried the older versions (torch==2.1.2, torchvision==0.16.2, autoawq==0.1.7, autoawq_kernels==0.0.6), I got the error OSError: Error no file named pytorch_model.bin, model.safetensors, tf_model.h5, model.ckpt.index or flax_model.msgpack found in directory.
@casper-hansen @franchukpetro @49Simon @oreojason, has anyone managed to fix this issue?