
Increased VRAM consumption when coupled with DDP

Open TParcollet opened this issue 11 months ago • 0 comments

System Info

Hello there,

I'm fine-tuning a Llama 3 model from Hugging Face with PEFT and BitsAndBytes. Interestingly, when I wrap the model with DDP, training ends up taking more VRAM on the master GPU. More interestingly, this extra VRAM usage grows with the number of GPUs. Do you see any reason why this could happen?

Reproduction

Not easy to reproduce.

Expected behavior

VRAM consumption on each GPU is constant w.r.t. the number of DDP processes.
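To make this expectation checkable, one option is to record the peak memory of every rank for several world sizes and compare rank 0 across runs. The following is only a minimal sketch: the function names `rank0_growth` and `is_constant` and the 5% tolerance are hypothetical, and it assumes the per-rank peaks were collected separately, for instance by calling `torch.cuda.max_memory_allocated()` on each rank at the end of a training step.

```python
from typing import Dict, List


def rank0_growth(peaks: Dict[int, List[int]]) -> Dict[int, float]:
    """Relative growth of rank 0 peak memory versus the smallest world size.

    `peaks` maps world_size -> list of per-rank peak bytes (list index = rank).
    The readings are assumed to be gathered elsewhere (hypothetical input).
    """
    if not peaks:
        raise ValueError("no readings provided")

    # Use the run with the fewest processes as the baseline.
    baseline = peaks[min(peaks)][0]
    if baseline <= 0:
        raise ValueError("baseline peak must be positive")

    growth = {}
    for world_size in sorted(peaks):
        per_rank = peaks[world_size]
        if len(per_rank) != world_size:
            raise ValueError(f"expected {world_size} readings, got {len(per_rank)}")
        # Positive values mean rank 0 uses more memory than in the baseline run.
        growth[world_size] = (per_rank[0] - baseline) / baseline
    return growth


def is_constant(peaks: Dict[int, List[int]], tolerance: float = 0.05) -> bool:
    """True if rank 0 peak memory stays within `tolerance` of the baseline."""
    return all(abs(g) <= tolerance for g in rank0_growth(peaks).values())
```

If `is_constant` returns `False` and the growth values increase with the world size, that matches the behavior described above; comparing rank 0 against the other ranks in the same run would additionally show whether only the master GPU is affected.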

TParcollet, Mar 04 '25 13:03