stanford_alpaca
Finetune with A100 40G
Can we use A100 40G GPUs to finetune LLaMA-7B? Has anyone tried that?
I tried 8 A100 40G GPUs to finetune LLaMA-7B with FSDP offload, and it works fine for me.
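For anyone finding this later, here is a minimal sketch of what such a launch command might look like. It is based on the training command in this repo's README with CPU offload added to the `--fsdp` flag; the paths, port, and batch sizes are placeholders/assumptions, not the exact settings used above:

```bash
# Hypothetical 8x A100 40G launch with FSDP CPU offload (not verified settings).
torchrun --nproc_per_node=8 --master_port=12345 train.py \
    --model_name_or_path /path/to/llama-7b \
    --data_path ./alpaca_data.json \
    --bf16 True \
    --output_dir ./output \
    --num_train_epochs 3 \
    --per_device_train_batch_size 3 \
    --per_device_eval_batch_size 3 \
    --gradient_accumulation_steps 8 \
    --learning_rate 2e-5 \
    --weight_decay 0. \
    --warmup_ratio 0.03 \
    --lr_scheduler_type "cosine" \
    --logging_steps 1 \
    --fsdp "full_shard auto_wrap offload" \
    --fsdp_transformer_layer_cls_to_wrap 'LlamaDecoderLayer' \
    --tf32 True
```

Note that the FSDP wrap class name depends on your transformers version (older versions of this repo used `'LLaMADecoderLayer'`), so check the model class in your installed library.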
Thank you so much for the response! Did you try 4 A100 40G as well?
I tried 4 A100 40GB GPUs with FSDP offload, but had to reduce the eval and train batch sizes from 3 to 2 to avoid OOM. Training took 58 hours.
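For reference, a sketch of how the command above might change for this 4-GPU setup. Only the GPU count and batch sizes come from this thread; the doubled gradient accumulation (to keep the same effective batch size) and all omitted hyperparameters, which mirror the earlier sketch, are assumptions:

```bash
# Hypothetical 4x A100 40G variant: batch size 2 per device, accumulation
# doubled to 16 so the effective batch size matches the 8-GPU sketch.
torchrun --nproc_per_node=4 --master_port=12345 train.py \
    --model_name_or_path /path/to/llama-7b \
    --data_path ./alpaca_data.json \
    --bf16 True \
    --output_dir ./output-4gpu \
    --per_device_train_batch_size 2 \
    --per_device_eval_batch_size 2 \
    --gradient_accumulation_steps 16 \
    --fsdp "full_shard auto_wrap offload" \
    --fsdp_transformer_layer_cls_to_wrap 'LlamaDecoderLayer' \
    --tf32 True
```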
I tried the same configuration with 4 A100 40G GPUs, but it still OOMs. Can you publish your parameter settings? Thanks! @ffohturk