
How to train a medium or large model with limited GPU capacity?

Open ElizavetaSedova opened this issue 1 year ago • 3 comments

I have GPUs with 24 GB of memory each. Training the medium model fails with torch.cuda.OutOfMemoryError. Is there a way to train a medium or large model on these cards? I would be glad of any advice!

ElizavetaSedova, Dec 12 '23 17:12

Same issue here. I believe you have to set fsdp = true and autocast = false.
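To see why a 24 GB card can run out of memory on the medium model, and why sharding with FSDP is the usual suggestion, here is a rough back-of-the-envelope sketch. It assumes fp32 weights, fp32 gradients, and an Adam-style optimizer with two fp32 moment buffers (16 bytes per parameter), and it ignores activations, which come on top and grow with batch size. The parameter counts are the published MusicGen sizes (300M, 1.5B, 3.3B). The helper `model_state_gb` and the GPU count of 4 are hypothetical, chosen only for illustration, and the sharded figure assumes model state is split evenly across GPUs.

```python
# Rough per-GPU memory estimate for model state only (activations excluded).
# Assumption: fp32 weights + fp32 gradients + two fp32 Adam moment buffers.
BYTES_PER_PARAM = 4 + 4 + 8  # weights + gradients + Adam moments

# Published MusicGen parameter counts.
MODEL_SIZES = {"small": 300e6, "medium": 1.5e9, "large": 3.3e9}


def model_state_gb(n_params: float, n_gpus: int = 1, sharded: bool = False) -> float:
    """Return the estimated model-state memory per GPU in GB (1 GB = 1e9 bytes)."""
    total_bytes = n_params * BYTES_PER_PARAM
    if sharded:
        # Assumption: state is split evenly across all GPUs (full sharding).
        total_bytes /= n_gpus
    return total_bytes / 1e9


if __name__ == "__main__":
    n_gpus = 4  # hypothetical GPU count
    for name, n_params in MODEL_SIZES.items():
        replicated = model_state_gb(n_params)
        sharded = model_state_gb(n_params, n_gpus=n_gpus, sharded=True)
        print(f"{name}: {replicated:.1f} GB replicated, "
              f"{sharded:.1f} GB per GPU when sharded over {n_gpus} GPUs")
```

Under these assumptions the medium model needs about 24 GB for model state alone, so it cannot fit on a single 24 GB card without sharding, lower-precision state, or a similar measure. Sharding only helps when more than one GPU takes part in the run.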

Saltb0xApps, Dec 13 '23 19:12

@Saltb0xApps Unfortunately this doesn't work for me when training a medium model. I tested this on a small model and noticed that the overall memory consumption did not change at all, with all other parameters being the same. I used the smallest batch size.

ElizavetaSedova, Dec 18 '23 12:12

Try lowering the batch_size. Reducing the number of epochs shortens training but does not lower peak memory use.

astralmedia, Dec 25 '23 19:12