Fine-tuning finished in 1.5 hours
Hello,
Thanks for sharing this amazing work!
I tried to fine-tune Alpaca-7B. I used the same data in this repo and the same command posted in the README. The only thing I did differently is that instead of installing transformers from the fork mentioned in the README, I used the main version from Hugging Face, which now supports LlamaTokenizer and LlamaForCausalLM. The LLaMA model I used is from the Hugging Face Hub: "decapoda-research/llama-7b-hf".
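For reference, this is roughly how I load the model and tokenizer with the mainline transformers release (just a minimal sketch of my own setup, not code from this repo):

from transformers import LlamaForCausalLM, LlamaTokenizer

# mainline transformers now ships these classes directly, no fork needed
tokenizer = LlamaTokenizer.from_pretrained("decapoda-research/llama-7b-hf")
model = LlamaForCausalLM.from_pretrained("decapoda-research/llama-7b-hf")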
The question is that I finished the fine-tuning in 1.5 hours on 4 A100 80GB GPUs, but your blog mentions that it took 3 hours to fine-tune on 8 A100 GPUs. I am wondering what caused this difference. Any ideas? Should I be worried about the quality of my fine-tuned model?
Thank you! Zheng Tang
Do you mind revealing what your train loss and validation accuracy percentages were in the end?
No problem. This is the last training log: {'train_runtime': 4744.911, 'train_samples_per_second': 32.879, 'train_steps_per_second': 0.257, 'train_loss': 0.7364038264222921, 'epoch': 3.0} I didn't see any validation accuracy in the log file.
Attached is train-alpaca-log.txt, which contains all the losses from the log.
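(If I wanted an eval loss, I think I could hold out part of alpaca_data.json and enable evaluation through the standard Hugging Face TrainingArguments; a rough sketch using generic Trainer options, not anything the repo's train.py does by default:)

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./alpaca-7b-out",   # hypothetical output path
    evaluation_strategy="steps",    # run evaluation every eval_steps
    eval_steps=200,
    per_device_eval_batch_size=4,
)
# Then pass eval_dataset=... to Trainer(...) so eval_loss shows up in the logs.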
Similar issue here. For me, it took around 40 minutes with 8 80GB A100 GPUs. The final train loss is around 0.5, and the trained model performs badly (it cannot handle any instructions appropriately, behaving just like the original LLaMA).
I trained on 8 80GB A100s with a per-device batch size of 2 for 3 epochs, which took about 50 minutes. The final train loss, from the tail of trainer_state.json, is:
{
  ...
  "log_history": [
    ...,
    {
      "epoch": 3.0,
      "step": 1218,
      "total_flos": 1.2039681644848742e+17,
      "train_loss": 0.7313933935511876,
      "train_runtime": 3166.1436,
      "train_samples_per_second": 49.273,
      "train_steps_per_second": 0.385
    }
  ],
  "max_steps": 1218,
  "num_train_epochs": 3,
  "total_flos": 1.2039681644848742e+17,
  "trial_name": null,
  "trial_params": null
}
And it can generate responses properly.
Prompt: Below is an instruction that describes a task. Write a response that appropriately completes the request.
### Instruction:
The highest mountain in China is
### Response:
Result: The highest mountain in China is Mount Everest, which is located on the border of Tibet and Qinghai provinces and stands at an elevation of 8,848 meters (29,029 feet).</s>
--------------------------------------------------
Prompt: Below is an instruction that describes a task. Write a response that appropriately completes the request.
### Instruction:
1+1=?
### Response:
Result: 2</s>
--------------------------------------------------
Prompt: Below is an instruction that describes a task. Write a response that appropriately completes the request.
### Instruction:
Who are you?
### Response:
Result: I'm a 21-year-old student studying computer science. I'm passionate about coding, design, and music. I'm also a big advocate for gender equality and mental health awareness.</s>
--------------------------------------------------
Prompt: Below is an instruction that describes a task. Write a response that appropriately completes the request.
### Instruction:
Hello, have a nice day!
### Response:
Result: Bye!</s>
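The generations above came from something like the following (a rough sketch of my own inference code; the checkpoint path and decoding settings are just what I happened to use, not part of this repo):

import torch
from transformers import LlamaForCausalLM, LlamaTokenizer

PROMPT = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:\n"
)

# hypothetical path to my fine-tuned checkpoint
tokenizer = LlamaTokenizer.from_pretrained("./alpaca-7b-out")
model = LlamaForCausalLM.from_pretrained("./alpaca-7b-out", torch_dtype=torch.float16).cuda()

inputs = tokenizer(PROMPT.format(instruction="Who are you?"), return_tensors="pt").to("cuda")
output = model.generate(**inputs, max_new_tokens=256, do_sample=False)
# print only the newly generated tokens after the prompt
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:]))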
Hello, do you mind providing your training script?