GPTQ-for-LLaMa
Will LoRAs work with this?
See: https://github.com/tloen/alpaca-lora/blob/main/generate.py
I tried modifying the code to look like this, but no luck so far.
```python
from peft import PeftModel
from transformers import LLaMATokenizer, LLaMAForCausalLM, GenerationConfig

tokenizer = LLaMATokenizer.from_pretrained("decapoda-research/llama-7b-hf")

model = LLaMAForCausalLM.from_pretrained(
    "decapoda-research/llama-7b-hf",
    load_in_8bit=True,
    device_map="auto",
)
model = PeftModel.from_pretrained(model, "tloen/alpaca-lora-7b")
```
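In case it helps anyone experimenting, here is an untested sketch of the quantized equivalent, assuming GPTQ-for-LLaMa's `load_quant` helper from `llama_inference.py` (its signature has changed across versions) and a placeholder 4-bit checkpoint path:

```python
# Untested sketch: load a GPTQ-quantized LLaMA, then try to attach the LoRA.
# load_quant is the helper from GPTQ-for-LLaMa's llama_inference.py; run this
# from the repo directory. Checkpoint path and wbits value are placeholders.
from peft import PeftModel
from transformers import LLaMATokenizer
from llama_inference import load_quant

tokenizer = LLaMATokenizer.from_pretrained("decapoda-research/llama-7b-hf")

# Hypothetical 4-bit checkpoint produced earlier by llama.py with --save
model = load_quant("decapoda-research/llama-7b-hf", "llama7b-4bit.pt", 4)
model.to("cuda")

# Open question: whether peft can inject the LoRA weights here, since it
# expects plain nn.Linear modules rather than the repo's QuantLinear layers.
model = PeftModel.from_pretrained(model, "tloen/alpaca-lora-7b")
```

My guess is that the sticking point is the last line: peft matches target modules by name and type, so the quantized `QuantLinear` layers may not be recognized where it expects `nn.Linear`.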