
After DPO training with LoRA adapter, generate completions with original model

Open athoag-sony opened this issue 2 years ago • 1 comments

I am new to fine-tuning. I ran through this very helpful notebook and got it to run locally: https://colab.research.google.com/drive/15vttTpzzVXv_tJwEk-hIcQ0S9FcEWvwP?usp=sharing#scrollTo=EWGFqAo5Q2me

I assume that after dpo_trainer.train() finishes, the "model" object is the fine-tuned model, updated in place. Is it possible to deactivate the LoRA adapter and recover the "original" model, i.e. the model before fine-tuning? Or do I have to reload the original model into memory? I'd like to generate completions with both the original and fine-tuned models to compare them.

athoag-sony avatar Mar 15 '24 20:03 athoag-sony

@athoag-sony Oh I'll edit the DPO notebook to add saving methods at the bottom!

You can wrap generation in with model.disable_adapter(): ... to run the model with the LoRA adapter temporarily disabled, i.e. the original base model.
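A minimal sketch of the comparison, assuming a PEFT-wrapped model (as produced by the Unsloth notebook) and a loaded tokenizer; the prompt and generation settings are illustrative, and PEFT's `disable_adapter()` context manager is what temporarily bypasses the LoRA weights:

```python
# Sketch: compare completions from the fine-tuned model vs. the base model.
# Assumes `model` is a PEFT/LoRA-wrapped model after dpo_trainer.train()
# and `tokenizer` is its tokenizer (both from the notebook, not defined here).

prompt = "Explain LoRA in one sentence."  # hypothetical example prompt
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Fine-tuned model: LoRA adapter active (the default after training).
tuned_ids = model.generate(**inputs, max_new_tokens=64)
tuned_text = tokenizer.decode(tuned_ids[0], skip_special_tokens=True)

# Original model: same weights in memory, adapter bypassed inside the block.
with model.disable_adapter():
    base_ids = model.generate(**inputs, max_new_tokens=64)
base_text = tokenizer.decode(base_ids[0], skip_special_tokens=True)

print("Fine-tuned:", tuned_text)
print("Base:", base_text)
```

The adapter is re-enabled automatically when the `with` block exits, so no reload of the original model is needed.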

danielhanchen avatar Mar 16 '24 01:03 danielhanchen