lit-llama
Finetune LLAMA-65B using LoRA
Hi, is there any guidance or are there examples on fine-tuning the 65B model with LoRA?
Yes, the good news is that it's possible.
Generally, it works the same way as outlined in the LoRA finetuning how-to: https://github.com/Lightning-AI/lit-llama/blob/main/howto/finetune_lora.md
By default, python finetune/lora.py uses the checkpoints/lit-llama/7B/lit-llama.pth checkpoint, but you can point it at the 65B weights with the --pretrained_path flag:
python finetune/lora.py --pretrained_path checkpoints/lit-llama/65B/lit-llama.pth
If you bump into GPU memory issues, you can try reducing the micro-batch size in finetune/lora.py.
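As a rough illustration (not an exact copy of the script), the knobs in question are plain module-level constants near the top of finetune/lora.py; the exact names and defaults may differ between versions of the repo:

```python
# Hypothetical excerpt of the hyperparameter block in finetune/lora.py.
# Exact names and defaults may differ between lit-llama versions.

devices = 8            # use all available GPUs
micro_batch_size = 1   # smallest per-step batch; gradient accumulation
                       # still builds up to the effective batch_size
max_seq_length = 256   # shorter sequences reduce activation memory

# LoRA keeps the 65B base weights frozen; only the small low-rank
# adapter matrices (size controlled by lora_r) receive gradients.
lora_r = 8
lora_alpha = 16
lora_dropout = 0.05
```

Lowering micro_batch_size and max_seq_length trades speed for memory; the effective batch size is typically preserved through gradient accumulation.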
Thanks for the suggestions!
Is it possible to LoRA-finetune the 65B model on 8 V100s (32 GB)?
I don't know for sure, but it worked for me on 8 A100 cards.
Hi rasbt, thanks very much for sharing this project. I can run llama-lora on my local server without much struggle. Is it possible to fine-tune the 65B model on two or more (A100) servers just by configuring the hyperparameters?
Nope, I guess at least 8x A100 80GB are needed for fine-tuning LLaMA 65B. I tried 4x A100 80GB by just configuring the hyperparameters, but OOM errors happened.
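A rough back-of-envelope estimate (counting only the frozen 65B base weights in 16-bit precision, and ignoring activations, LoRA adapters, optimizer state, and per-process CUDA overhead) gives a feel for why 8x 80GB cards are comfortable while smaller setups are tight:

```python
# Back-of-envelope only: the frozen 65B base weights in 16-bit precision.
# Real usage is higher: activations, LoRA gradients, optimizer state and
# per-process CUDA overhead all come on top, and nothing is sharded
# perfectly evenly in practice.
params = 65e9
weights_gb = params * 2 / 1e9                  # 2 bytes per parameter (bf16/fp16)
print(f"frozen base weights alone: ~{weights_gb:.0f} GB")   # ~130 GB

for name, per_gpu_gb, n_gpus in [("8 x V100 32GB", 32, 8),
                                 ("4 x A100 80GB", 80, 4),
                                 ("8 x A100 80GB", 80, 8)]:
    ideal_share = weights_gb / n_gpus          # best case: weights split evenly
    print(f"{name}: {per_gpu_gb * n_gpus} GB total, "
          f"~{ideal_share:.0f} GB/GPU for weights in the best case")
```

The aggregate capacity being larger than ~130 GB is necessary but not sufficient: the ignored terms and uneven placement are what can push a 4x A100 80GB or 8x V100 32GB setup over the edge in practice.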
I don't know for sure, but it worked for me on 8 A100 cards.
Hi @rasbt,
I'm experiencing an issue while running the lora.py code for finetuning the LLaMA 65B model with the default choices. I have a machine with 8 A100 SXM 80GB GPUs. In order to utilize all the GPUs, I changed the number of devices to 8 and reduced the micro-batch size to 1, while keeping everything else the same.
During the first attempt, I encountered a CUDA out-of-memory error. I monitored CPU and GPU usage and noticed that the total CPU RAM, which is 2 TB, was nearly fully utilized. When I tried again, there was no GPU out-of-memory error, so I suspect that CPU RAM may be the limiting factor here.
Could you please let me know how much CPU RAM your machine has? If you also have 2 TB, do you have any suggestions for resolving this issue? Additionally, I wanted to confirm whether you used the LLaMA 65B model from Hugging Face's decapoda-research/llama-65b-hf repository.
Thank you for your assistance!
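For what it's worth, a very rough worst-case calculation (assuming each of the 8 ranks eagerly loads the full 16-bit checkpoint into host RAM before moving it to its GPU, i.e. no lazy or sharded loading) is consistent with CPU RAM being the bottleneck:

```python
# Worst case: each of the 8 processes materializes the full 16-bit
# checkpoint in host RAM at the same time (no lazy or sharded loading).
params = 65e9
checkpoint_gb = params * 2 / 1e9      # ~130 GB per copy
ranks = 8

peak_host_gb = checkpoint_gb * ranks  # every rank holding a full copy
print(f"one checkpoint copy: ~{checkpoint_gb:.0f} GB")
print(f"peak host RAM with {ranks} eager ranks: ~{peak_host_gb:.0f} GB")
# ~1 TB for the model copies alone, before the dataset, pinned buffers
# and the OS page cache -- consistent with 2 TB of RAM nearly filling up.
```

If that is what is happening, checking whether the script's lazy checkpoint-loading path is actually being used (or loading on one rank and broadcasting) would be the first thing to try.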
Thanks for your answer, I've downloaded the LLaMA checkpoint from huggyllama.