
Finetune LLAMA-65B using LoRA

Open jiaaoc opened this issue 1 year ago • 8 comments

Hi, are there any guides/examples on fine-tuning the 65B model with LoRA?

jiaaoc avatar Jun 08 '23 20:06 jiaaoc

Yes, the good news is that it's possible.

Generally, it works the same way as outlined in the LoRA finetuning howto: https://github.com/Lightning-AI/lit-llama/blob/main/howto/finetune_lora.md

By default, when you run python finetune/lora.py, it uses the checkpoints/lit-llama/7B/lit-llama.pth checkpoint, but you can use the --pretrained_path flag to point it at the 65B weights:

python finetune/lora.py --pretrained_path checkpoints/lit-llama/65B/lit-llama.pth

If you bump into GPU memory issues, you can try to

  1. increase the number of devices on line 55 -- it will then automatically use tensor sharding;

  2. reduce the micro-batch size on line 36 from 4 to 2 or 1 -- the results will be the same due to gradient accumulation; it will just run a bit more slowly through the dataset.

(Both changes are sketched together right after this list.)
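A minimal sketch of both changes near the top of finetune/lora.py -- the line numbers and default values are approximate and may have moved between versions of the script:

```python
# Sketch of the relevant settings in finetune/lora.py, with both
# memory-saving changes applied (exact names/defaults may differ):

devices = 8  # line ~55; with devices > 1, tensor sharding kicks in automatically

batch_size = 128
micro_batch_size = 1  # line ~36; reduced from the default of 4
gradient_accumulation_iters = batch_size // micro_batch_size

# Gradient accumulation keeps the effective batch size at `batch_size`,
# so the results match the defaults; each pass over the data just takes longer.
```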

rasbt avatar Jun 08 '23 21:06 rasbt

Thanks for the suggestions!

jiaaoc avatar Jun 08 '23 21:06 jiaaoc

Is it possible to LoRA-finetune the 65B model on 8 V100s (32 GB each)?

jiaaoc avatar Jun 08 '23 21:06 jiaaoc

I don't know for sure, but it worked for me on 8 A100 cards.

rasbt avatar Jun 08 '23 22:06 rasbt

Hi rasbt, thanks very much for sharing this project. I can run llama-lora on my local server without much struggle. Is it possible to fine-tune the 65B model on two or more (A100) servers just by configuring the hyperparameters?

richardsun-voyager avatar Jun 13 '23 02:06 richardsun-voyager

Nope, I'd guess at least 8x A100-80GB are needed for fine-tuning LLaMA-65B. I tried 4x A100-80GB with just the hyperparameters reconfigured, but OOM errors happened.
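A quick back-of-envelope is consistent with that -- this assumes bf16 weights and ignores activations, the LoRA optimizer state, and framework overhead, so real usage is higher:

```python
# Rough weight-memory estimate for LLaMA-65B (bf16 = 2 bytes/param);
# activations and overhead are ignored, so actual usage is higher.
n_params = 65e9
weights_gb = n_params * 2 / 1e9  # ~130 GB of weights in total

for devices in (4, 8):
    print(f"{devices} x A100-80GB: ~{weights_gb / devices:.0f} GB of weights per GPU")
# 4 GPUs -> ~32 GB/card, 8 GPUs -> ~16 GB/card; on 4 cards the remaining
# headroom can be eaten by activations at longer sequence lengths, hence OOM.
```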

ChaoyuHuang avatar Jun 28 '23 03:06 ChaoyuHuang

> I don't know for sure, but it worked for me on 8 A100 cards.

Hi @rasbt,

I'm running into an issue while running the lora.py code to finetune the LLaMA 65B model with the default settings. I have a machine with 8 A100-SXM 80GB GPUs. To utilize all of them, I changed the device count to 8 and reduced the micro-batch size to 1, keeping everything else the same.

During the first attempt, I encountered a CUDA out-of-memory error. I monitored CPU and GPU usage and noticed that the total CPU RAM, 2 TB, was nearly fully utilized. However, when I tried again, there was no GPU out-of-memory error. I suspect the CPU may be the limiting factor here.
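(For reference, I was watching host memory with a small loop like the one below, run in a second shell -- a minimal sketch; psutil is my own choice and not part of lit-llama:)

```python
# Crude host-RAM monitor, run in a separate shell while finetuning.
# Requires the third-party `psutil` package.
import time
import psutil

while True:
    mem = psutil.virtual_memory()
    print(f"host RAM: {mem.used / 1e9:.0f} / {mem.total / 1e9:.0f} GB ({mem.percent}%)")
    time.sleep(10)
```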

Could you let me know how much CPU RAM your machine has? If you also have 2 TB, do you have any suggestions for resolving this issue? Additionally, I wanted to confirm whether you used the LLaMA 65B model from Hugging Face's decapoda-research/llama-65b-hf repository.

Thank you for your assistance!

weilong-web avatar Jun 29 '23 15:06 weilong-web

Thanks for your answer. I've downloaded the LLaMA checkpoint from huggyllama.


ChaoyuHuang avatar Jul 03 '23 02:07 ChaoyuHuang