alpaca-lora
Does the instruction fine-tuning code work with LLAMA-2?
The fine-tuning code runs when I replace the base model with LLAMA-2.
I am aware that LLAMA and LLAMA-2 share the same configuration files and other associated components.
However, I would still like to solicit the thoughts and insights of the experienced members of the community. Thank you for your input!
I have been successfully running the finetune.py script on the HF-converted Llama2-7B-Chat for a few hours now, so I would be inclined to say that it works. :)
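For reference, here is a minimal sketch of the model-loading step, assuming the converted chat checkpoint is the `meta-llama/Llama-2-7b-chat-hf` Hub id and following roughly the same loading pattern finetune.py uses for LLaMA-1 (8-bit loading needs bitsandbytes installed):

```python
import torch
from transformers import LlamaForCausalLM, LlamaTokenizer

# Assumed HF Hub id for the converted Llama-2 7B chat model.
base_model = "meta-llama/Llama-2-7b-chat-hf"

# Llama-2 shares the LlamaForCausalLM architecture with LLaMA-1,
# so only the model id needs to change.
model = LlamaForCausalLM.from_pretrained(
    base_model,
    load_in_8bit=True,          # requires bitsandbytes
    torch_dtype=torch.float16,
    device_map="auto",
)
tokenizer = LlamaTokenizer.from_pretrained(base_model)
```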
@RazvanBerbece Thanks for your input!
Can alpaca-lora work with different (smaller) models, like 'facebook/bart-large' for example? I am trying to train it right now with 'facebook/bart-large', and I see that both the train loss and the eval loss drop to almost zero very quickly during training. Any idea why?
EDIT: BART is a masked language model, and apparently not well suited to 'instruct' tasks. When I switched to a causal language model, like OPT-125M, the problem was solved. The MLM was just copying the prompt.
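In case it helps anyone trying the same swap: a minimal LoRA setup on a small causal LM like OPT-125M, using peft directly. The r/alpha/dropout values here are illustrative rather than tuned, and the target modules are the attention projections used by the OPT architecture:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

model_name = "facebook/opt-125m"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

lora_config = LoraConfig(
    r=8,                                 # illustrative rank, not tuned
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"], # OPT attention projection names
    bias="none",
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # sanity check: only LoRA params train
```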
Hello, I used the same code, conda env, hyper-parameters, and dataset, and just replaced the base model with LLAMA-2. It did run, but the loss wouldn't converge, a problem I had never seen when fine-tuning the original base model. Does your code work? I'm wondering whether this problem is caused by the version of transformers or peft. My transformers version is 4.30.2, and peft is 0.3.0.
Yeah, I think when it comes to this repo, you have to be aware of the dependency versioning. I had the same issue recently with LLaMA v1, so I had to revert the versions of peft and transformers. I am betting that for LLaMA v2 there is another set of version requirements.
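Since the behaviour seems to hinge on the installed versions, one quick sanity check is to print what the active environment is actually running before debugging further (the non-converging run above was on transformers 4.30.2 and peft 0.3.0):

```python
# Print the library versions in the active environment;
# compare against the combination that worked for you.
import peft
import torch
import transformers

print("transformers:", transformers.__version__)
print("peft:", peft.__version__)
print("torch:", torch.__version__)
```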
@TBAsoul same issue. Did you find a solution?