ms-swift
Training Llama 3.1 70B using 4× A6000
I need to train Llama 3.1 70B with 8-bit QLoRA (or something similar) on 4 × A6000 GPUs, i.e. 192 GB of total VRAM. Can I do it?
If not, what is the best way to train it within 192 GB of total VRAM?
You can try running it with LoRA:
CUDA_VISIBLE_DEVICES=0,1,2,3 \
swift sft \
--model_type llama3_1-70b-instruct \
--dataset <dataset> \
--num_train_epochs 5 \
--sft_type lora \
--output_dir output \
...
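If you specifically want QLoRA, note the sizing first: 70B parameters in bf16 are roughly 140 GB of weights alone, which is tight on 192 GB once activations and optimizer state are added, whereas 4-bit quantized weights take roughly 35 GB. Below is a minimal sketch, assuming your installed ms-swift version supports the `--quantization_bit` argument (check `swift sft --help`); `<dataset>` is a placeholder and the hyperparameters just mirror the LoRA example above:

# Hedged QLoRA sketch: --quantization_bit 4 loads the base model in 4-bit via
# bitsandbytes; 8 should also be accepted if you prefer 8-bit, at roughly
# double the weight memory.
CUDA_VISIBLE_DEVICES=0,1,2,3 \
swift sft \
--model_type llama3_1-70b-instruct \
--dataset <dataset> \
--num_train_epochs 5 \
--sft_type lora \
--quantization_bit 4 \
--output_dir output \
...

With all four GPUs listed in CUDA_VISIBLE_DEVICES and no distributed launcher, swift should shard the model across them automatically (device_map-style model parallelism), so the quantized 70B model plus LoRA training state should fit comfortably in 192 GB.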