FastChat
How to run train_lora.py
Hi Team,
Can someone please provide the command-line instructions to run train_lora.py? The README only contains the command for train_mem.py.
Thank you
Me too! I'd also like to ask how to train LoRA for Vicuna.
Please check here (note that the given configuration may not work; it's just an example of the use case).
Have you managed to get LoRA training running?
I have not run it successfully. Even using the LoRA training script, I still get an out-of-memory error on my GPU (3090, 24 GB) when fine-tuning the 7B base model. How much GPU memory is required for LoRA training? I'm also curious why I can easily run LoRA training with alpaca-lora when fine-tuning the same 7B base model. Is FastChat heavier?
For the OOM, please try adding these lines from alpaca-lora. I'll open a PR if that works.
I have tried this; it does train with alpaca-lora, but when I run inference it produces results as if it were not trained.
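One frequent cause of "trains fine but inference looks untrained" is that the LoRA adapter never gets applied at inference time (or, with some older peft versions, the adapter weights were saved as a nearly empty file, a known alpaca-lora issue). As a sanity check, here is a minimal sketch of applying a trained adapter for inference with peft; the base-model name and adapter path below are placeholder assumptions:

# Minimal sketch: apply a trained LoRA adapter at inference time.
# "huggyllama/llama-7b" and "./lora-checkpoint" are placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model = "huggyllama/llama-7b"   # assumption: same base model used for training
adapter_path = "./lora-checkpoint"   # assumption: your LoRA output directory

tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(
    base_model, torch_dtype=torch.float16, device_map="auto"
)
# Without this step the unmodified base model runs, which looks "untrained".
model = PeftModel.from_pretrained(model, adapter_path)
model.eval()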
flash_attn is not supported. Use load_in_8bit, PEFT, and bitsandbytes to reduce memory usage; it requires about 13 GB of GPU memory.
See https://github.com/git-cloner/llama-lora-fine-tuning#341-fine-tuning-command for the training script.
train_lora.py needs to be modified; refer to:
https://github.com/git-cloner/llama-lora-fine-tuning/blob/main/fastchat/train/train_lora.py
and
https://github.com/git-cloner/llama-lora-fine-tuning/blob/main/deepspeed-config.json
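For reference, the load_in_8bit / PEFT / bitsandbytes approach mentioned above boils down to something like the following. This is a minimal sketch, not the exact train_lora.py patch; the checkpoint name and LoRA hyperparameters are illustrative, and newer peft versions rename prepare_model_for_int8_training to prepare_model_for_kbit_training:

# Minimal sketch: load the base model in 8-bit and attach a LoRA adapter.
import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model, prepare_model_for_int8_training

base_model = "huggyllama/llama-7b"  # assumption: any LLaMA-7B checkpoint

model = AutoModelForCausalLM.from_pretrained(
    base_model,
    load_in_8bit=True,           # bitsandbytes int8 weights, roughly 13 GB for 7B
    torch_dtype=torch.float16,
    device_map="auto",
)
model = prepare_model_for_int8_training(model)  # cast norms, enable input grads

lora_config = LoraConfig(
    r=8,                         # matches the --lora_r 8 used elsewhere in this thread
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()   # only the small LoRA matrices are trainable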
@little51 How to support multi-GPU training?
@zl1994 Multi-GPU runs will hit RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu; this has not been resolved yet.
Yeah, I have encountered the same problem and am still working on it. If you have any good news, please let me know😂. Thank you
If you have multiple GPUs, update fastchat/train/train_lora.py and use the --num_gpus parameter, for example:
CUDA_VISIBLE_DEVICES=0,1 deepspeed --num_gpus=2 fastchat/train/train_lora.py \
    --deepspeed ./deepspeed-config.json \
    --lora_r 8 \
    ...
https://github.com/git-cloner/llama-lora-fine-tuning/blob/main/fastchat/train/train_lora.py
and
https://github.com/git-cloner/llama-lora-fine-tuning#341-fine-tuning-command
So, on a single NVIDIA 3090, FastChat doesn't support fine-tuning the vicuna-7B model with LoRA, right?
It's not because of LoRA; it's the flash_attn problem. You need to test on the 3090 to see whether flash_attn raises an error.
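If you want to check whether flash_attn itself is the blocker on your card, a quick probe like the following tells you whether it even loads (this assumes the flash-attn pip package is installed; the version attribute may differ between releases):

# Quick probe: does flash_attn import on this machine, and what GPU is it?
import torch

print("GPU:", torch.cuda.get_device_name(0))
print("Compute capability:", torch.cuda.get_device_capability(0))

try:
    import flash_attn  # assumption: installed via the flash-attn pip package
    print("flash_attn loaded:", getattr(flash_attn, "__version__", "unknown version"))
except Exception as e:  # ImportError or a CUDA-arch error raised at load time
    print("flash_attn failed to load:", e)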
Here is a validated implementation for fine-tuning vicuna-7b on a single 3090 GPU or on multiple GPUs: https://github.com/chengzl18/vicuna-lora