FastChat
Out of GPU memory using 4x A100 40GB
Hi, I used the training script in the README without changing the data or parameters, but I still run out of GPU memory. Have you tested it on 4x A100 40GB? How much GPU memory does it use for you?
Yeah, same situation here. Even after downsizing `--per_device_train_batch_size` from the original 2 to 1, it still OOMs.
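If anyone wants to try squeezing under 40 GB on 4 cards first, two standard HF Trainer knobs are worth a shot. This is just a sketch I haven't verified on this repo: it assumes the README script forwards standard `TrainingArguments`, the accumulation value is illustrative, and the README command may already enable some of these:

```shell
torchrun --nproc_per_node=4 --master_port=20001 fastchat/train/train_mem.py \
    ... \
    --per_device_train_batch_size 1 \
    --gradient_accumulation_steps 32 \
    --gradient_checkpointing True \
    --fsdp "full_shard auto_wrap offload"
```

Trading per-device batch size for `--gradient_accumulation_steps` keeps the effective batch size the same, and the `offload` FSDP option moves parameters and gradients to CPU at a real speed cost.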
Maybe some heroes can solve this using DeepSpeed?
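Not a hero, but here is a minimal DeepSpeed sketch in case anyone wants to try it, untested on this repo. It assumes the training script exposes HF Trainer's `--deepspeed` argument; the config keys are standard DeepSpeed ZeRO-3 with CPU offload, with the `"auto"` values filled in by the HF integration. Don't combine it with the `--fsdp` flags:

```shell
cat > ds_zero3_offload.json <<'EOF'
{
  "bf16": { "enabled": "auto" },
  "zero_optimization": {
    "stage": 3,
    "offload_optimizer": { "device": "cpu", "pin_memory": true },
    "offload_param": { "device": "cpu", "pin_memory": true },
    "overlap_comm": true,
    "contiguous_gradients": true
  },
  "train_micro_batch_size_per_gpu": "auto",
  "gradient_accumulation_steps": "auto",
  "train_batch_size": "auto"
}
EOF

# Use the deepspeed launcher instead of torchrun; other flags as in the README.
deepspeed fastchat/train/train_mem.py \
    ... \
    --deepspeed ds_zero3_offload.json
```

ZeRO-3 shards parameters, gradients, and optimizer states across ranks, and the CPU offload trades step time for VRAM, which is often the difference between OOM and fitting on 40 GB cards.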
We have tried to train the 7B model on 8x A100 40GB with the default settings, and almost all GPU memory is eaten up. Even with batch size 1, the model still consumes about 30GB on each card. So I think the minimum requirement for training Vicuna is 8 cards; 4 cards simply won't do the job.
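For a rough sanity check on those numbers (my own back-of-envelope, not measured): with bf16 weights and gradients plus fp32 Adam states (master weights, momentum, variance), a 7B model needs roughly 7B x (2 + 2 + 12) bytes ≈ 112 GB of persistent state. Fully sharded with FSDP across 8 GPUs, that is about 14 GB per card, and activations plus temporary buffers at batch size 1 with 2048-token sequences can plausibly account for the rest of the ~30 GB observed.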
Agreed. BTW, how about the iteration speed on 8x A100 40GB? @yzxyzh
Our speed is about 90 s/it.
Closing, as the issue has been resolved.
@yzxyzh You mentioned you are using 8x A100 40GB, but the README.md says you can use the following command to train Vicuna-7B with 4x A100 (40GB). Is that just a typo in the README?