llama-cookbook
General question about the difference between finetuning with Hugging Face's Trainer and the llama-recipes finetune script
🚀 The feature, motivation and pitch
I see that the finetuning tutorial links to a Hugging Face Trainer notebook. What is the difference between finetuning Llama 3 with the Hugging Face Trainer notebook and with the llama-recipes finetune script? Do the two approaches give different finetuning performance on Llama 3? What are the advantages and disadvantages of each?
Hi @Tizzzzy, the quick start notebook is meant to get you up and running within minutes and supports a single GPU. The finetuning script, on the other hand, lets you scale your finetuning up to multiple GPUs and nodes (a sketch of both launch modes follows).
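For concreteness, the two paths look roughly like this. The script path and some flag names have changed across llama-recipes versions, so treat this as a sketch of the invocation rather than the exact command:

```bash
# Single GPU: run the finetuning script directly (the quick start notebook
# covers the same case interactively).
python recipes/quickstart/finetuning/finetuning.py \
    --model_name meta-llama/Meta-Llama-3-8B \
    --use_peft --peft_method lora \
    --output_dir ./peft_out

# Multiple GPUs on one node: launch the same script with torchrun and
# enable FSDP to shard the model across workers.
torchrun --nnodes 1 --nproc_per_node 4 recipes/quickstart/finetuning/finetuning.py \
    --enable_fsdp \
    --model_name meta-llama/Meta-Llama-3-8B \
    --use_peft --peft_method lora \
    --output_dir ./peft_out
```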
Hello, I want to know why you didn't use Hugging Face's Trainer; I found it to be faster.
@JuiceLemonLemon the reason was that issues with the HF Trainer were being opened here, which meant we either had to work with HF or debug the Trainer ourselves. It was more approachable to rely on local code in this repo. Are you facing any issues with the recipe trainer?
I just found that the training time was longer compared with Hugging Face's Trainer. I don't know why.
@JuiceLemonLemon closing this for now as it's answered, but in case you have any more details that would let me reproduce this, could you kindly share them?
I'd love to take a look and compare.
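For anyone who wants to reproduce the comparison, a minimal Hugging Face Trainer timing harness along these lines could serve as the baseline to measure the recipe trainer against. The model name, dataset, and hyperparameters below are illustrative placeholders, not the setup from this thread:

```python
# Hypothetical timing harness: LoRA-finetune a Llama 3 model with the HF
# Trainer and report wall-clock time, so it can be compared like-for-like
# with the llama-recipes finetuning script.
import time

import torch
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_name = "meta-llama/Meta-Llama-3-8B"  # assumes gated access is granted
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # Llama tokenizers ship no pad token

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,  # bf16 assumes an Ampere-or-newer GPU
    device_map="auto",           # requires the accelerate package
)

# LoRA keeps the run cheap; use the same rank/targets on both trainers
# so the timing comparison is fair.
lora_config = LoraConfig(
    r=8,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

# Placeholder dataset; any instruction-style corpus with a "text" field works.
dataset = load_dataset("tatsu-lab/alpaca", split="train[:1%]")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset.map(tokenize, batched=True, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="hf_trainer_out",
        per_device_train_batch_size=1,
        gradient_accumulation_steps=4,
        num_train_epochs=1,
        bf16=True,
        logging_steps=10,
    ),
    train_dataset=tokenized,
    # Causal LM collator (mlm=False) pads batches and builds labels.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)

start = time.perf_counter()
trainer.train()
print(f"HF Trainer wall-clock: {time.perf_counter() - start:.1f}s")
```

Running the llama-recipes script on the same model, dataset slice, and LoRA settings, then comparing the two wall-clock numbers, would pin down where the reported slowdown comes from.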