davidray222
Same error here. Have you solved it?
@VainF Do you have any suggestions? Thank you!! I think the model has already been severely damaged after pruning, so fine-tuning may not be very effective.
@Cyber-Vadok I think there will be a size mismatch problem when loading the model after pruning, but we can try!
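In case it is useful, here is a minimal sketch of the kind of save/load I have in mind (assuming the model was pruned in place, e.g. with Torch-Pruning; the paths are placeholders). Saving the whole model object instead of only the state_dict should avoid the size mismatch on reload, since the pruned shapes no longer match the original config:

```python
import torch

# Assumption: `model` has already been pruned in place (e.g. with Torch-Pruning),
# so its layer shapes no longer match the original HF config. Reloading a plain
# state_dict into a freshly built model would raise size-mismatch errors.

# Save the full model object so the pruned architecture is preserved.
torch.save(model, "llama7b_pruned.pt")  # placeholder path

# Later: reload the pruned model directly, without from_pretrained / load_state_dict.
pruned_model = torch.load("llama7b_pruned.pt", map_location="cpu")
pruned_model.eval()
```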
I wonder whether I put the wrong lora_path. Thank you!
@yuhuixu1993 Thanks!
@yuhuixu1993 I performed quantization using the AutoGPTQ method with the script AutoGPTQ/examples/quantization/quant_with_alpaca.py, using the command: python quant_with_alpaca.py --pretrained_model_dir huggyllama/llama-7b --quantized_model_dir llama7b-quant4bit-g32 --bits 4 --group_size 32 --save_and_reload  I would like to...
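For reference, this is roughly what the same 4-bit / group-size-32 setup looks like through the AutoGPTQ Python API; a minimal sketch only, with a single placeholder calibration sample instead of the full alpaca set:

```python
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig

pretrained = "huggyllama/llama-7b"
quantized_dir = "llama7b-quant4bit-g32"

tokenizer = AutoTokenizer.from_pretrained(pretrained, use_fast=True)

# Same settings as the CLI call above: 4-bit weights, group size 32.
quantize_config = BaseQuantizeConfig(bits=4, group_size=32, desc_act=False)

model = AutoGPTQForCausalLM.from_pretrained(pretrained, quantize_config)

# In practice the examples should be a few hundred tokenized alpaca prompts;
# a single placeholder sample is shown here.
examples = [tokenizer("Below is an instruction that describes a task.", return_tensors="pt")]

model.quantize(examples)
model.save_quantized(quantized_dir)
tokenizer.save_pretrained(quantized_dir)
```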
@yuhuixu1993 Could you please briefly explain how to decode the zeros to FP16? Thank you~~
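I am not the author, but as far as I understand the qzeros in a GPTQ checkpoint are just low-bit integers packed into int32 tensors, so "decoding to FP16" means unpacking them and casting. A rough sketch, assuming 4-bit packing (8 values per int32) and ignoring the +1 offset that some GPTQ versions apply to the stored zeros:

```python
import torch

def unpack_qzeros_to_fp16(qzeros: torch.Tensor, bits: int = 4) -> torch.Tensor:
    """Unpack GPTQ-style packed zero points (int32) into an fp16 tensor.

    Assumption: `bits`-wide values are packed little-endian into each int32,
    i.e. 32 // bits values per int32 (8 values for 4-bit).
    """
    per_int = 32 // bits
    mask = (1 << bits) - 1
    shifts = torch.arange(0, 32, bits, device=qzeros.device, dtype=torch.int32)

    # (rows, cols) -> (rows, cols, per_int): shift each int32 and mask out `bits` bits.
    unpacked = (qzeros.unsqueeze(-1) >> shifts) & mask

    # Flatten the packed dimension back into columns and cast to fp16.
    zeros = unpacked.reshape(qzeros.shape[0], qzeros.shape[1] * per_int)
    # Note: older GPTQ-for-LLaMa checkpoints store zeros - 1, so you may need `zeros + 1`.
    return zeros.to(torch.float16)
```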
@yuhuixu1993 I am still unable to merge my adapter with the quantized model, even after converting qzero to fp16. I have also tried using xxw11/AutoGPTQ_QALoRA, but it still didn’t work....
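One sanity check that might help isolate the problem: load the 4-bit checkpoint on its own with AutoGPTQ and print the packed shapes of one projection layer, so it is clear what the adapter has to match. A rough sketch (the path is a placeholder, and the attribute path assumes a LLaMA model):

```python
from auto_gptq import AutoGPTQForCausalLM

quantized_dir = "llama7b-quant4bit-g32"  # placeholder path

# Load the 4-bit checkpoint on its own, independent of any adapter.
# Depending on how it was saved, you may need use_safetensors=True here.
model = AutoGPTQForCausalLM.from_quantized(quantized_dir, device="cpu")

# Inspect one quantized projection to see the packed shapes the adapter must match.
layer = model.model.model.layers[0].self_attn.q_proj
print("qweight:", tuple(layer.qweight.shape))  # packed int32 weights
print("qzeros: ", tuple(layer.qzeros.shape))   # packed zero points, one row per group
print("scales: ", tuple(layer.scales.shape))   # fp16 scales, one row per group
```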
@yuhuixu1993 I have successfully quantized llama7b using your method from https://github.com/yuhuixu1993/GPTQ-for-LLaMa and obtained adapter_model.bin using qalora.py. However, I still encounter the same mismatch issue. I would like to ask if...
@yuhuixu1993 Yes, I used the peft_utils.py you provided to replace /qalora_test/lib/python3.8/site-packages/auto_gptq/utils/peft_utils.py and used the corresponding group size. Could it be that I'm using the wrong file for merge.py? Because when...
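Not sure about merge.py either, but one way to rule out a group-size mismatch is to compare the group size implied by the quantized checkpoint against the one implied by the adapter shapes. A rough sketch, assuming 4-bit packing, a flat state_dict, and my understanding that QA-LoRA's lora_A takes in_features // group_size inputs; paths and key names are placeholders and may differ in your checkpoints:

```python
import torch

# Placeholder paths: point these at your own quantized checkpoint and QA-LoRA adapter.
quant_ckpt = torch.load("path/to/llama7b-4bit-32g.pt", map_location="cpu")
adapter = torch.load("path/to/adapter_model.bin", map_location="cpu")

# Pick one quantized projection (key names can differ between GPTQ variants).
qweight_key = next(k for k in quant_ckpt if k.endswith("q_proj.qweight"))
scales_key = qweight_key.replace("qweight", "scales")

in_features = quant_ckpt[qweight_key].shape[0] * 8   # 4-bit: 8 values per packed int32
num_groups = quant_ckpt[scales_key].shape[0]
print("checkpoint group size:", in_features // num_groups)

# QA-LoRA's lora_A operates on in_features // group_size inputs, so its second
# dimension reveals the group size the adapter was trained with.
lora_key = next(k for k in adapter if "q_proj" in k and "lora_A" in k)
lora_A = adapter[lora_key]
print("adapter lora_A:", tuple(lora_A.shape),
      "-> adapter group size:", in_features // lora_A.shape[1])
```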