
How much VRAM is needed to finetune the 3B model? Is 12GB enough?

Open universewill opened this issue 1 year ago • 1 comment


universewill avatar Sep 28 '23 09:09 universewill

Unfortunately, 12GB is not enough to finetune the 3B model in the standard way (tuning all parameters). That is because, in addition to the weights, the optimizer state and the gradients must also be kept in memory. This Hugging Face blog post briefly describes how much each of those parts contributes to VRAM usage. For our model, we used a single A100 80GB GPU, and usage metrics show that > 70GB of GPU memory was allocated.
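As a back-of-the-envelope check, here is a sketch of the per-parameter memory accounting. It assumes the common Adam mixed-precision recipe described in the Hugging Face post (fp16 weights and gradients plus fp32 master weights and two Adam moments), not necessarily the exact training setup used for this model:

```python
# Rough VRAM estimate for full finetuning with Adam in mixed precision.
# Per-parameter costs in bytes (assumed standard recipe, not this repo's exact setup):
#   fp16 weights (2) + fp16 gradients (2)
#   + fp32 master weights (4) + Adam first moment (4) + Adam second moment (4)
BYTES_PER_PARAM = 2 + 2 + 4 + 4 + 4  # = 16

def estimate_vram_gb(num_params: float) -> float:
    """Lower bound in GB; activations and CUDA overhead come on top."""
    return num_params * BYTES_PER_PARAM / 1024**3

print(f"3B model: ~{estimate_vram_gb(3e9):.0f} GB before activations")
```

Even this lower bound is roughly 45GB for 3B parameters, well above 12GB; activations push the real usage higher, consistent with the > 70GB observed on the A100.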

CStanKonrad avatar Sep 29 '23 12:09 CStanKonrad