alignment-handbook
alignment-handbook copied to clipboard
wierd conversation with zephyr-7b-dpo-lora
When chatting with zephyr-7b-dpo-lora, as shown in the fig above , only the first 'Hello' was I sent, all the following content are generated by zephyr, including the <user> prompt. I cannot figure out why.
Try loading the sft adapter first. Then merge the adapter into the base model and than load the dpo adapter. U can use the following code:
model_name = "alignment-handbook/zephyr-7b-sft-lora" tokenizer = AutoTokenizer.from_pretrained("alignment-handbook/zephyr-7b-sft-lora") model = AutoPeftModelForCausalLM.from_pretrained( model_name, device_map="auto", use_flash_attention_2=True, torch_dtype = torch.bfloat16, use_cache=True ) print("Merging Model") model = model.merge_and_unload() print("Model Merged")
peft_config = PeftConfig.from_pretrained("alignment-handbook/zephyr-7b-dpo-lora")
model = PeftModel.from_pretrained(model, "alignment-handbook/zephyr-7b-dpo-lora")
Try loading the sft adapter first. Then merge the adapter into the base model and than load the dpo adapter. U can use the following code:
Where do I put these code?