Add LongLoRA for both full and LoRA fine-tuning
Follow-up of #1346.
This PR introduces LongLoRA as described in https://github.com/Lightning-AI/litgpt/issues/1237 for both LoRA and full fine-tuning, and also enables it during generation.
cc @rasbt
@rasbt to answer your previous question: LongLoRA is not enabled by default, since both `longlora_context_length` and `longlora_n_groups` are `None`, but I agree with you that there should be a simpler way to enable it. As you suggested, I can add a `LongLoraArgs` dataclass, as you have done in the galore branch; that way I can also check those args in a separate function (like `validate_train_args`).
Thanks! I think `LongLoraArgs` might be better, especially if it can be used in multiple approaches, e.g., full fine-tuning and LoRA.
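For concreteness, a minimal sketch of what such a dataclass and its validation could look like (the field names come from this PR, but the exact shape of `LongLoraArgs` and the `validate_longlora_args` helper here are assumptions, not the final implementation):

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LongLoraArgs:
    """LongLoRA settings; with both fields left as ``None``, LongLoRA stays disabled."""

    longlora_context_length: Optional[int] = None
    longlora_n_groups: Optional[int] = None

    @property
    def enabled(self) -> bool:
        # LongLoRA is only active when both values are provided.
        return self.longlora_context_length is not None and self.longlora_n_groups is not None


def validate_longlora_args(args: LongLoraArgs) -> None:
    # Hypothetical check: either both fields are set (enabled) or both are None (disabled).
    provided = (args.longlora_context_length is not None, args.longlora_n_groups is not None)
    if any(provided) and not all(provided):
        raise ValueError(
            "longlora_context_length and longlora_n_groups must be set together to enable LongLoRA"
        )
```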
I've just trained a model with:

```bash
python litgpt/finetune/lora.py \
  --config=/teamspace/studios/this_studio/litgpt/config_hub/finetune/mistral-7b/longlora.yaml \
  --checkpoint_dir=/teamspace/studios/this_studio/litgpt/checkpoints/mistralai/Mistral-7B-Instruct-v0.1
```
One generation that I've obtained with

```bash
python litgpt/generate/base.py \
  --checkpoint_dir ../out/finetune/lora-mistral-7b/final \
  --prompt="Recommend a movie for me to watch during the weekend and explain the reason." \
  --max_new_tokens=128
```

is the following:
```text
Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
Recommend a movie for me to watch during the weekend and explain the reason.

### Response:
I recommend the movie "Inception" directed by Christopher Nolan. This is an excellent sci-fi thriller that will keep you on the edge of your seat throughout the entire film. The story follows a professional thief, Dom (played by Leonardo DiCaprio), who is able to steal information from someone's subconscious while they dream. Dom is offered a chance at clemance in exchange for performing the near-impossible task of planting an idea into someone's mind, an act known as Inception.
This movie is a fantastic choice for the weekend because it's not only entertaining, but it also challenges you to think critically about the concepts presented within the film. The plot is twisting and turning, keeping you engaged from beginning to end. Additionally, the special effects and visuals are stunning, making for a truly immersive viewing experience. Moreover, with its all-star cast, including Joseph Gordon-Levitt, Ellen Page, and Tom Hardy, you know you're in for a treat.
Overall, "Inception" is an outstanding choice for the weekend because it provides an exciting and thought-provoking movie experience that is sure to leave a lasting

Time for inference 2: 15.43 sec total, 16.59 tokens/sec
Memory used: 14.67 GB
```
Nice, this is a good sign that things work!
What are the other options? Are "wte,norm,ln" the only allowed ones, or are there more?

In the paper the authors specify that, to extend the context length effectively while using LoRA, you also need to fine-tune the embedding layer and every norm layer (ref. Table 2), without specifying anything else. I put the LoRA fine-tuning defaults there and leave experimentation with other values to the user.
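As an illustration of how such a comma-separated setting could be applied after the usual LoRA freezing step, something along these lines (the function name and the substring matching on parameter names are assumptions, not this PR's actual implementation):

```python
import torch.nn as nn


def mark_extra_trainable(model: nn.Module, trainable_params: str = "wte,norm,ln") -> None:
    """Re-enable gradients for non-LoRA layers named in the comma-separated string.

    The LongLoRA paper (Table 2) reports that the embedding and normalization
    layers must also be fine-tuned for the context extension to be effective.
    """
    keywords = [kw.strip() for kw in trainable_params.split(",") if kw.strip()]
    for name, param in model.named_parameters():
        # Substring match on parameter names, e.g. "transformer.wte.weight".
        if any(kw in name for kw in keywords):
            param.requires_grad = True
```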
Oh sorry, I wasn't clear. I meant more like: what are the supported options here? What values can a user typically put in? E.g., analogous to
https://github.com/Lightning-AI/litgpt/blob/b9ddd8bdd8e759702ddb5b624333f422b4e76b5e/litgpt/pretrain.py#L46
But you probably can't use `Literal` here because of the various combinations within that string. Maybe in the comments you could mention which of the terms within that comma-separated string are supported?
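Since `Literal` can't capture arbitrary combinations, one alternative would be a runtime check against an explicit allow-list, e.g. (a sketch; the allow-list below just reuses the "wte,norm,ln" terms from the default above):

```python
# Hypothetical allow-list: the terms used by the LongLoRA defaults above.
SUPPORTED_TRAINABLE_PARAMS = {"wte", "norm", "ln"}


def validate_trainable_params(trainable_params: str) -> list:
    """Split the comma-separated string and reject anything outside the allow-list."""
    keywords = [kw.strip() for kw in trainable_params.split(",") if kw.strip()]
    unsupported = [kw for kw in keywords if kw not in SUPPORTED_TRAINABLE_PARAMS]
    if unsupported:
        raise ValueError(
            f"Unsupported trainable params {unsupported}; "
            f"supported values are {sorted(SUPPORTED_TRAINABLE_PARAMS)}"
        )
    return keywords
```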
Sorry for the long silence, and thanks again for this great PR! I have just been a bit swamped with work lately but hopefully can circle back to it some time.
Absolutely no worries! Thank you for taking the time to look at this!