FastChat: How to use LoRA to train the 30B model on multiple machines and multiple cards?

Open Awyshw opened this issue 2 years ago • 3 comments

Apr 27 '23 06:04 Awyshw

We've tested that the current script runs on a single machine with multiple cards (8 x 40GB A100) using DeepSpeed ZeRO-3. The multi-node case has not been tested yet; it is a work in progress.

Apr 30 '23 01:04 ZYHowell
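
For readers looking for a starting point, here is a minimal sketch of what a DeepSpeed ZeRO-3 setup along these lines might look like. It is not FastChat's actual configuration. The `zero_optimization` keys and the `hostname slots=N` hostfile format are standard DeepSpeed conventions, but the batch-size values, the host names (`node-a`, `node-b`), the output file name `ds_config.json`, and all helper function names are assumptions made for illustration. The multi-node part is untested, in line with the status reported above.

```python
import json


def zero3_config() -> dict:
    """Build a minimal DeepSpeed ZeRO-3 config (values are illustrative assumptions)."""
    return {
        "bf16": {"enabled": True},  # assumes bf16-capable GPUs such as A100
        "zero_optimization": {
            "stage": 3,  # shard parameters, gradients, and optimizer state
            "overlap_comm": True,
            "contiguous_gradients": True,
            "stage3_gather_16bit_weights_on_model_save": True,
        },
        "train_micro_batch_size_per_gpu": 1,  # assumption, tune for your setup
        "gradient_accumulation_steps": 16,  # assumption, tune for your setup
    }


def sharded_weight_gib(n_params: float, bytes_per_param: int, n_gpus: int) -> float:
    """Estimate per-GPU memory for model weights when sharded evenly across GPUs."""
    return n_params * bytes_per_param / n_gpus / 2**30


def hostfile(nodes: dict) -> str:
    """Render a DeepSpeed-style hostfile: one 'hostname slots=N' line per node."""
    return "\n".join(f"{host} slots={slots}" for host, slots in nodes.items()) + "\n"


if __name__ == "__main__":
    # Write the config to a hypothetical file name.
    with open("ds_config.json", "w") as f:
        json.dump(zero3_config(), f, indent=2)

    # 30B parameters in 16-bit precision, sharded over one 8-GPU machine.
    print(f"{sharded_weight_gib(30e9, 2, 8):.2f} GiB of weights per GPU")

    # Hypothetical two-node layout with 8 GPUs each (untested).
    print(hostfile({"node-a": 8, "node-b": 8}), end="")
```

The estimate covers only the frozen 16-bit base weights, which come to about 7 GiB per GPU on 8 cards. Activations, LoRA adapter parameters, and their optimizer state add to that figure.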

> We've tested that the current script is runnable on multiple cards, a single machine(8 x 40GB A100) using DeepSpeed ZeRO-3. The multi-node case is not tested yet. It's WIP

Is this still in progress?

May 09 '23 06:05 cason0126

mark

Jun 09 '23 09:06 gebilaoman

@ZYHowell, any progress on this? I've also seen an issue about using Slurm around here...

Oct 21 '23 16:10 surak
