南栖

Results: 53 comments by 南栖

I tried it again and it worked:

![image](https://github.com/PanQiWei/AutoGPTQ/assets/76865636/3ca512ba-be78-48a2-b55b-6a71efaf2510)

code:

```python
from transformers import AutoTokenizer, TextGenerationPipeline
from transformers import LlamaForCausalLM, LlamaTokenizer
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig
import logging

logging.basicConfig(
    format="%(asctime)s %(levelname)s [%(name)s] %(message)s",
...
```
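The logging call above is truncated in the snippet. A minimal stdlib-only completion might look like this (the `level` and `datefmt` arguments are assumptions, since the original call is cut off):

```python
import logging

# Same format string as in the snippet above; level and datefmt are
# assumed, as the original basicConfig call is truncated.
logging.basicConfig(
    format="%(asctime)s %(levelname)s [%(name)s] %(message)s",
    level=logging.INFO,
    datefmt="%Y-%m-%d %H:%M:%S",
)

logger = logging.getLogger("auto_gptq")
logger.info("quantization started")
```

This produces lines like `2024-01-01 12:00:00 INFO [auto_gptq] quantization started`, which matches the AutoGPTQ examples' logging style.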

It's OK in quant.py:

```python
weights = (self.scales[self.g_idx.long()] * (weight - zeros[self.g_idx.long()]))
```
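That line performs group-wise dequantization: `g_idx` maps each weight row to its quantization group, whose scale and zero point are then applied. A plain-Python sketch with toy values (the real code operates on torch tensors):

```python
def dequantize(qweight, scales, zeros, g_idx):
    """Recover float weights from quantized values: for each weight i,
    look up its group g = g_idx[i] and compute scales[g] * (q - zeros[g])."""
    return [scales[g] * (q - zeros[g]) for q, g in zip(qweight, g_idx)]

# Toy example: 4 quantized values split into 2 groups of 2.
qweight = [3, 7, 1, 15]   # quantized integer values (assumed 4-bit range)
scales  = [0.5, 0.25]     # one scale per group
zeros   = [8, 8]          # one zero point per group
g_idx   = [0, 0, 1, 1]    # group index for each weight

print(dequantize(qweight, scales, zeros, g_idx))  # [-2.5, -0.5, -1.75, 1.75]
```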

Switch the transformers version, e.g. to 4.37.2.
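Pinning the version can be done with pip; the exact version (4.37.2) comes from the comment above, and downgrading in place may require `--force-reinstall` depending on the environment:

```shell
pip install transformers==4.37.2
python -c "import transformers; print(transformers.__version__)"
```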

It's available in this branch: https://github.com/Minami-su/attention_sinks_autogptq @synacktraa

And then I tried this: `pip install git+https://github.com/tomaarsen/attention_sinks.git@model/qwen_fa`, and an error happened:

```
The repository for Qwen-7B-Chat2 contains custom code which must be executed to correctly load the model. You can inspect the repository...
```
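That message means transformers refuses to execute the model repository's custom modeling code without an explicit opt-in. A hedged sketch of the opt-in (the model path is an assumption standing in for the actual local checkpoint, and whether this resolves the reported error is not confirmed here):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# trust_remote_code=True explicitly allows transformers to run the
# repository's custom modeling code, which the error says is required.
# "Qwen/Qwen-7B-Chat" is a placeholder for the actual local path used.
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen-7B-Chat", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen-7B-Chat", trust_remote_code=True)
```

Only enable this for repositories whose code you have inspected and trust, since it executes arbitrary Python from the model repo.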