AutoAWQ

DeepSeek-Coder-V2-Lite-Instruct Error!

Open • tohnee opened this issue 1 year ago • 1 comment

Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:13<00:00, 3.28s/it]
Repo card metadata block was not found. Setting CardData to empty.
Token indices sequence length is longer than the specified maximum sequence length for this model (132274 > 16384). Running this sequence through the model will result in indexing errors
AWQ: 0%| | 0/27 [00:05<?, ?it/s]
Traceback (most recent call last):
  File "/testspace/repo/deepseek/AutoAWQ/tests/deepseek_quantize.py", line 33, in <module>
    model.quantize(tokenizer, quant_config=quant_config, calib_data=load_wikitext())
  File "/opt/conda/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/testspace/repo/deepseek/AutoAWQ/awq/models/base.py", line 232, in quantize
    self.quantizer.quantize()
  File "/testspace/repo/deepseek/AutoAWQ/awq/quantize/quantizer.py", line 166, in quantize
    scales_list = [
  File "/testspace/repo/deepseek/AutoAWQ/awq/quantize/quantizer.py", line 167, in <listcomp>
    self._search_best_scale(self.modules[i], **layer)
  File "/opt/conda/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/testspace/repo/deepseek/AutoAWQ/awq/quantize/quantizer.py", line 330, in _search_best_scale
    best_scales = self._compute_best_scale(
  File "/testspace/repo/deepseek/AutoAWQ/awq/quantize/quantizer.py", line 391, in _compute_best_scale
    self.pseudo_quantize_tensor(fc.weight.data)[0] / scales_view
  File "/testspace/repo/deepseek/AutoAWQ/awq/quantize/quantizer.py", line 76, in pseudo_quantize_tensor
    assert org_w_shape[-1] % self.group_size == 0
AssertionError

tohnee • Oct 30 '24 12:10

Seconding this issue with autoawq-0.2.7.post3.

stas00 • Jan 08 '25 01:01
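
The assertion at the bottom of the traceback, `org_w_shape[-1] % self.group_size == 0`, fails when the last dimension of a weight tensor is not a multiple of the configured group size. As a rough way to narrow down which layer might be responsible, the same divisibility check can be run over a table of weight shapes. The sketch below is only an illustration: the helper `find_incompatible_layers` and the shapes in `example_shapes` are hypothetical and were not read from the DeepSeek-Coder-V2-Lite-Instruct checkpoint or taken from AutoAWQ.

```python
def find_incompatible_layers(weight_shapes, group_size):
    """Return the names of layers whose last weight dimension
    is not a multiple of group_size (the condition the assert rejects)."""
    return [
        name
        for name, shape in weight_shapes.items()
        if shape[-1] % group_size != 0
    ]


# Hypothetical (out_features, in_features) shapes, for illustration only.
# They are NOT taken from the actual model checkpoint.
example_shapes = {
    "mlp.up_proj": (10944, 2048),
    "mlp.down_proj": (2048, 10944),
}

# Layers that would trip the assertion for each candidate group size.
bad_128 = find_incompatible_layers(example_shapes, group_size=128)
bad_64 = find_incompatible_layers(example_shapes, group_size=64)

print("group_size=128:", bad_128)
print("group_size=64: ", bad_64)
```

With a loaded PyTorch model, a similar table could presumably be built by walking `model.named_modules()` and recording `weight.shape` for each `torch.nn.Linear`. Whether a smaller group size is actually supported by the quantization kernels is a separate question that this check does not answer.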