LoRA applied to all
This is an experiment (perhaps it should be a Draft?) to apply LoRA not only to the query and value matrices, but also to:
- query
- key
- value
- projection
- MLP
- head
as described in this issue (in the Lit-LLaMA repo).
Changes include porting the `Linear` class from Microsoft's LoRA repo so it can be used as a replacement for `nn.Linear`. `MergedLinear` handles the attention operations, `Linear` everything else.
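For reference, here is a minimal sketch of what such a `Linear` replacement does, following the design of Microsoft's loralib but simplified (no dropout, no merged-weight path); the class name is illustrative, not the exact code in this PR:

```python
import math

import torch
import torch.nn as nn
import torch.nn.functional as F


class LoRALinear(nn.Linear):
    """Drop-in nn.Linear replacement computing W x + (B @ A) x * scaling."""

    def __init__(self, in_features: int, out_features: int, r: int = 0, lora_alpha: int = 1, **kwargs):
        super().__init__(in_features, out_features, **kwargs)
        self.r = r
        if r > 0:
            # A projects down to rank r, B projects back up. B starts at zero,
            # so at initialization the layer behaves exactly like nn.Linear.
            self.lora_A = nn.Parameter(torch.empty(r, in_features))
            self.lora_B = nn.Parameter(torch.zeros(out_features, r))
            nn.init.kaiming_uniform_(self.lora_A, a=math.sqrt(5))
            self.scaling = lora_alpha / r
            # Only the low-rank matrices are trained; the base weight is frozen.
            self.weight.requires_grad = False

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        result = F.linear(x, self.weight, self.bias)
        if self.r > 0:
            result = result + (x @ self.lora_A.T @ self.lora_B.T) * self.scaling
        return result
```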
Experiments can now be run from the CLI by providing arguments:

```bash
python finetune/lora.py --checkpoint_dir ... --query_lora True --key_lora True --value_lora True --projection_lora True --mlp_lora True --head_lora True
```
The naming is lame, I know, so I need help with it.
By default (if no arguments are provided), the fine-tuning script applies LoRA only to query and value, so it mimics the previous behavior.
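To make the defaults explicit, here is a hypothetical sketch of the entry point's signature (the flag names mirror the CLI arguments above; the real signature in `finetune/lora.py` may differ):

```python
def setup(
    checkpoint_dir: str,
    # Only query and value default to True, so running the script
    # without flags reproduces the previous query+value-only behavior.
    query_lora: bool = True,
    key_lora: bool = False,
    value_lora: bool = True,
    projection_lora: bool = False,
    mlp_lora: bool = False,
    head_lora: bool = False,
) -> None:
    ...
```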
I don't have a GPU at my disposal, so I did a sanity check on my laptop's CPU and on a Google Colab GPU with the pythia-70m model and the Alpaca dataset. Someone with big guns from the Lightning.ai team should now check whether there are any improvements with a much, much bigger model, when time allows, of course.
Note: when I tested in Google Colab on an Nvidia T4 with `"16-mixed"` precision, I got:

`RuntimeError: probability tensor contains either inf, nan or element < 0`

so I tested with `"32-true"` instead; the T4 doesn't support bf16.
Still, it was weird. I understand why I might get this type of error with `16-true` (as explained in this article from the Lightning.ai blog), but with mixed precision it should work without problems, shouldn't it?
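As a workaround, one can pick the precision based on what the GPU actually supports. This is a minimal sketch assuming the script sets up Lightning Fabric and that a CUDA device is available; the variable names are illustrative:

```python
import torch
import lightning as L

# A T4 has no bf16 support, and "16-mixed" raised the RuntimeError above
# in my runs, so fall back to full precision on such cards.
precision = "bf16-mixed" if torch.cuda.is_bf16_supported() else "32-true"
fabric = L.Fabric(devices=1, precision=precision)
```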
P.S. The same issue (or is it not an issue?) also occurs with the code from the main branch.