
Gradient checkpointing issue when running QLoRA finetuning

Open · tytung2020 opened this issue · 1 comment

Finetuning mpt-7b and mpt-30b with QLoRA fails with "ValueError: MPTForCausalLM does not support gradient checkpointing.". Is there a way to fix this?
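For context, the error typically surfaces in a standard QLoRA setup like the sketch below, because `peft.prepare_model_for_kbit_training` enables gradient checkpointing by default; the quantization settings here are illustrative, not taken from this issue:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import prepare_model_for_kbit_training

# 4-bit (QLoRA-style) quantization config; the values are illustrative.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "mosaicml/mpt-7b",
    quantization_config=bnb_config,
    trust_remote_code=True,  # MPT ships custom modeling code
)

# This helper calls gradient_checkpointing_enable() by default, which is
# where the ValueError is raised, since MPT's custom code does not declare
# gradient checkpointing support.
model = prepare_model_for_kbit_training(model)
```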

tytung2020 · Jul 01 '23 15:07

Are these lines of code what's needed to make it work? cekal's amendment seems to work on the 7B version: https://huggingface.co/cekal/mpt-7b-peft-compatible/commit/a5eab52c1c61c1d50a4e01428949f6ff90c73c48, but I'm not sure it works fully as intended. Could someone at MosaicML check this? If so, please also implement it in the 30B version. Thanks~
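If it helps triage, here is a minimal sketch of what that kind of patch usually amounts to in the custom modeling code, assuming the `_set_gradient_checkpointing` hook that Transformers used around mid-2023. This is my reading of the general pattern, not a verified copy of the linked commit:

```python
# Excerpt-style sketch of modeling_mpt.py; MPTPreTrainedModel and MPTModel
# are the real class names in MPT's custom code (MPTModel is defined
# elsewhere in the file), but the linked commit may differ in detail.
from torch.utils.checkpoint import checkpoint
from transformers import PreTrainedModel


class MPTPreTrainedModel(PreTrainedModel):
    # Without this flag, gradient_checkpointing_enable() raises the
    # ValueError quoted above.
    supports_gradient_checkpointing = True

    def _set_gradient_checkpointing(self, module, value=False):
        # Transformers applies this hook to every submodule when
        # gradient_checkpointing_enable() is called.
        if isinstance(module, MPTModel):
            module.gradient_checkpointing = value


# Inside MPTModel.forward, each transformer block then has to be wrapped
# roughly like this when the flag is set:
#
#     if self.gradient_checkpointing and self.training:
#         x = checkpoint(block, x, attn_bias, attention_mask)
#     else:
#         x = block(x, attn_bias=attn_bias, attention_mask=attention_mask)
```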

tytung2020 · Jul 12 '23 04:07