
falcon-40b out of memory

Open · lynngao opened this issue on Jun 18, 2023 · 13 comments

Hi! I am trying to finetune falcon-40b on a single A100 GPU with 80GB of memory. I tried decreasing the micro batch size to 1, but it still goes OOM for both adapter_v2 and lora with bfloat16-mixed / fp16. Any suggestions on how to solve this without using multiple GPUs? Thanks a lot!

lynngao avatar Jun 18 '23 06:06 lynngao

I tried again with 2 A100 GPUs but it still goes OOM. I set devices = 2 and tried both lora and adapter_v2. Any help would be appreciated!

lynngao avatar Jun 18 '23 09:06 lynngao

Falcon 40B won't fit in a single 80GB card.

I will report back when I find out what the minimum memory requirement to fine-tune it is, but I don't have access to an A100 80GB right now.

carmocca avatar Jun 19 '23 19:06 carmocca
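
(For context, a rough back-of-envelope estimate of why that is; the figures below count only the weights and ignore activations, gradients, and optimizer state, all of which make things worse:)

```python
# Approximate memory needed just to hold Falcon-40B's weights.
params = 40e9  # ~40 billion parameters

print(f"bf16 / fp16 weights: ~{params * 2 / 1e9:.0f} GB")  # ~80 GB -- fills an A100 80GB on its own
print(f"fp32 weights:        ~{params * 4 / 1e9:.0f} GB")  # ~160 GB -- what 'bfloat16-mixed' keeps,
                                                           # since mixed precision retains fp32 weights
```

LoRA and adapter finetuning only shrink the gradients and optimizer state down to the small set of trainable parameters; the frozen base weights still have to fit.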

Any luck with finetuning? I'm running into OOM while trying to fine-tune Falcon 40B on an 8-GPU A100 80GB machine. I tried reducing num_devices and micro_batch_size and lowering the LoRA rank.

Update: It looks like the recent main doesn't support multi-GPU training. Any plans/threads to support that feature?

gpravi avatar Jun 22 '23 23:06 gpravi

@gpravi Distributed support for LoRA is tracked in #161

carmocca avatar Jun 22 '23 23:06 carmocca

So currently it's not possible to finetune Falcon 40B using Lit-parrot, right?

weilong-web avatar Jun 27 '23 13:06 weilong-web

@weilong-web Yeah, I don't think it works out of the box... Looks like someone managed to finetune Falcon 40b - https://github.com/Lightning-AI/lit-gpt/issues/198

gpravi avatar Jun 28 '23 22:06 gpravi

> @weilong-web Yeah, I don't think it works out of the box... Looks like someone managed to finetune Falcon 40b - #198

I am able to use this tool to finetune 40b: https://github.com/rmihaylov/falcontune

lynngao avatar Jun 28 '23 22:06 lynngao
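
(falcontune finetunes a 4-bit quantized copy of the model, which is essentially why it fits where the half-precision weights don't; same rough estimate, with the same caveats as above:)

```python
# Same back-of-envelope estimate for a 4-bit quantized Falcon-40B.
params = 40e9

print(f"4-bit weights: ~{params * 0.5 / 1e9:.0f} GB")  # ~20 GB, leaving headroom on an 80GB A100
                                                       # for activations and the LoRA adapter
```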

@lynngao

I was able to finetune the Falcon 40B instruct 4-bit version.

Were you able to finetune the Falcon 40B model? I ran into this issue while saving the checkpoint.

gpravi avatar Jun 28 '23 22:06 gpravi

> @lynngao
>
> I was able to finetune the Falcon 40B instruct 4-bit version.
>
> Were you able to finetune the Falcon 40B model? I ran into this issue while saving the checkpoint.

No, I only tried the 4-bit version.

lynngao avatar Jun 28 '23 22:06 lynngao

> > @weilong-web Yeah, I don't think it works out of the box... Looks like someone managed to finetune Falcon 40b - #198
>
> I am able to use this tool to finetune 40b: https://github.com/rmihaylov/falcontune

Were you able to run it with DDP or only on a single GPU?

alexeiga avatar Jul 05 '23 08:07 alexeiga

> @lynngao
>
> I was able to finetune the Falcon 40B instruct 4-bit version.
>
> Were you able to finetune the Falcon 40B model? I ran into this issue while saving the checkpoint.

Downgrading bitsandbytes to 0.37.2 worked for me (it took me a few days to find this thread...): https://github.com/TimDettmers/bitsandbytes/issues/324

alexeiga avatar Jul 05 '23 08:07 alexeiga
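
(For anyone hitting the same checkpoint-saving error: the fix above is just pinning the package, e.g. `pip install bitsandbytes==0.37.2`, in the environment used for finetuning.)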

@alexeiga Nice. Can you please share your configuration?

Also, the current main branch doesn't implement multi-GPU training. How did you manage to get it working?

gpravi avatar Jul 05 '23 17:07 gpravi

> @alexeiga Nice. Can you please share your configuration?
>
> Also, the current main branch doesn't implement multi-GPU training. How did you manage to get it working?

I tried to, but without success... I was only able to run on a single GPU, and training is VERY slow.

alexeiga avatar Jul 11 '23 05:07 alexeiga

QLoRA finetuning support is tracked in #176. Until that is supported, you can try the suggestions described in https://github.com/Lightning-AI/lit-gpt/blob/main/tutorials/oom.md

carmocca avatar Jul 12 '23 12:07 carmocca
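
(For reference, the suggestions in that tutorial map onto the knobs in the finetuning scripts roughly as sketched below. The names and values are illustrative, assuming the finetune/lora.py layout at the time; check the version you are running for the exact constants and flags.)

```python
# Illustrative only -- in lit-gpt these lived as module-level constants near the
# top of finetune/lora.py; names may differ between versions.
micro_batch_size = 1                 # smallest per-step batch; the biggest single memory lever
batch_size = 128                     # effective batch size, preserved via gradient accumulation
gradient_accumulation_iters = batch_size // micro_batch_size

lora_r = 4                           # lower rank -> fewer trainable params and less optimizer state
lora_alpha = 16
lora_dropout = 0.05

# Running with true half precision instead of mixed precision avoids keeping an
# fp32 copy of the weights, e.g. (flag spelling may vary by version):
#   python finetune/lora.py --checkpoint_dir checkpoints/tiiuae/falcon-40b --precision bf16-true
```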