After DPO training with a LoRA adapter, generate completions with the original model
I am new to fine-tuning. I worked through this very helpful notebook and got it running locally: https://colab.research.google.com/drive/15vttTpzzVXv_tJwEk-hIcQ0S9FcEWvwP?usp=sharing#scrollTo=EWGFqAo5Q2me
I assume that after `dpo_trainer.train()` finishes, the `model` object has been fine-tuned in place. Is it possible to recover the "original" model (i.e. the model before fine-tuning) by deactivating the LoRA adapter, or do I have to reload the original model into memory? I'd like to generate completions from both the original and the fine-tuned model to compare them.
@athoag-sony Oh, I'll edit the DPO notebook to add saving methods at the bottom!
You can use `with model.disable_adapter(): ...` to run the model with the LoRA adapter temporarily disabled. Inside that context the forward pass uses only the frozen base weights, so you get the original model's behavior without reloading a second copy into memory.