Adding full finetuning
As before, this pull request adds full fine-tuning support. The files changed are lora.py, trainer.py, and LORA.md (for the new arguments).
The new training arguments are:
python -m mlx_lm.lora \
--model \
--train \
--fine-tune-type full \
--data \
--iters 100 \
--batch-size 1 \
--val-batches 1 \
--adapter-path
To change the fine-tuning method, set --fine-tune-type to lora, dora, or full; the default is lora. The path where the adapters or the full model weights are stored is still given by --adapter-path.
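For context, here is a minimal sketch of how the new flag could be wired up on the command line. This is illustrative only, not the actual lora.py parser, and the help strings are my own wording:

import argparse

# Illustrative sketch of the new CLI options (not the actual lora.py source).
parser = argparse.ArgumentParser(description="LoRA/DoRA/full fine-tuning")
parser.add_argument(
    "--fine-tune-type",
    choices=["lora", "dora", "full"],
    default="lora",
    help="Fine-tuning method to use (default: lora).",
)
parser.add_argument(
    "--adapter-path",
    default="adapters",
    help="Directory where the adapters or full model weights are saved.",
)
args = parser.parse_args()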
Tested with Gemma, Mistral, and Llama 3 (tiny versions, because I only have an 8 GB MacBook Air). Here is an example with Gemma:
FULL
python -m mlx_lm.lora \
--model /Users/gokdenizgulmez/Library/Mobile\ Documents/com\~apple\~CloudDocs/Transformer\ Models/Safetensors/tiny-random-GemmaForCausalLM \
--train \
--fine-tune-type full \
--data /Users/gokdenizgulmez/Library/Mobile\ Documents/com\~apple\~CloudDocs/Datastes/data_tyni \
--iters 10 \
--batch-size 1 \
--val-batches 1 \
--adapter-path /Users/gokdenizgulmez/Desktop/tinyGemma-pretrain
Loading pretrained model
Loading datasets
Training
Training full model weights.
Trainable parameters: 100.000% (2.049M/2.049M)
Starting training..., iters: 10
Iter 1: Val loss 12.459, Val took 0.455s
Iter 2: Saved model checkpoint weights to /Users/gokdenizgulmez/Desktop/tinyGemma-pretrain/model.safetensors and /Users/gokdenizgulmez/Desktop/tinyGemma-pretrain/0000002_checkpoint.safetensors.
Iter 4: Saved model checkpoint weights to /Users/gokdenizgulmez/Desktop/tinyGemma-pretrain/model.safetensors and /Users/gokdenizgulmez/Desktop/tinyGemma-pretrain/0000004_checkpoint.safetensors.
Iter 6: Saved model checkpoint weights to /Users/gokdenizgulmez/Desktop/tinyGemma-pretrain/model.safetensors and /Users/gokdenizgulmez/Desktop/tinyGemma-pretrain/0000006_checkpoint.safetensors.
Iter 8: Saved model checkpoint weights to /Users/gokdenizgulmez/Desktop/tinyGemma-pretrain/model.safetensors and /Users/gokdenizgulmez/Desktop/tinyGemma-pretrain/0000008_checkpoint.safetensors.
Iter 10: Val loss 12.456, Val took 0.346s
Iter 10: Train loss 12.456, Learning Rate 1.000e-05, It/sec 17.743, Tokens/sec 17372.541, Trained Tokens 9791, Peak mem 6.281 GB
Iter 10: Saved model checkpoint weights to /Users/gokdenizgulmez/Desktop/tinyGemma-pretrain/model.safetensors and /Users/gokdenizgulmez/Desktop/tinyGemma-pretrain/0000010_checkpoint.safetensors.
Saved final full model weights to /Users/gokdenizgulmez/Desktop/tinyGemma-pretrain/model.safetensors.
LORA
python -m mlx_lm.lora \
--model /Users/gokdenizgulmez/Library/Mobile\ Documents/com\~apple\~CloudDocs/Transformer\ Models/Safetensors/tiny-random-GemmaForCausalLM \
--train \
--fine-tune-type lora \
--data /Users/gokdenizgulmez/Library/Mobile\ Documents/com\~apple\~CloudDocs/Datastes/data_tyni \
--iters 10 \
--batch-size 1 \
--val-batches 1 \
--adapter-path /Users/gokdenizgulmez/Desktop/tinyGemma-pretrain-lora
Loading pretrained model
Loading datasets
Training
Training model with LoRA.
Trainable parameters: 0.022% (0.000M/2.049M)
Starting training..., iters: 10
Iter 1: Val loss 12.459, Val took 0.372s
Iter 2: Saved adapter weights to /Users/gokdenizgulmez/Desktop/tinyGemma-pretrain-lora/adapter.safetensors and /Users/gokdenizgulmez/Desktop/tinyGemma-pretrain-lora/0000002_adapters.safetensors.
Iter 4: Saved adapter weights to /Users/gokdenizgulmez/Desktop/tinyGemma-pretrain-lora/adapter.safetensors and /Users/gokdenizgulmez/Desktop/tinyGemma-pretrain-lora/0000004_adapters.safetensors.
Iter 6: Saved adapter weights to /Users/gokdenizgulmez/Desktop/tinyGemma-pretrain-lora/adapter.safetensors and /Users/gokdenizgulmez/Desktop/tinyGemma-pretrain-lora/0000006_adapters.safetensors.
Iter 8: Saved adapter weights to /Users/gokdenizgulmez/Desktop/tinyGemma-pretrain-lora/adapter.safetensors and /Users/gokdenizgulmez/Desktop/tinyGemma-pretrain-lora/0000008_adapters.safetensors.
Iter 10: Val loss 12.457, Val took 0.219s
Iter 10: Train loss 12.457, Learning Rate 1.000e-05, It/sec 25.792, Tokens/sec 25253.294, Trained Tokens 9791, Peak mem 6.128 GB
Iter 10: Saved adapter weights to /Users/gokdenizgulmez/Desktop/tinyGemma-pretrain-lora/adapter.safetensors and /Users/gokdenizgulmez/Desktop/tinyGemma-pretrain-lora/0000010_adapters.safetensors.
Saved final adapter weights to /Users/gokdenizgulmez/Desktop/tinyGemma-pretrain-lora/adapter.safetensors.
DORA
python -m mlx_lm.lora \
--model /Users/gokdenizgulmez/Library/Mobile\ Documents/com\~apple\~CloudDocs/Transformer\ Models/Safetensors/tiny-random-GemmaForCausalLM \
--train \
--fine-tune-type dora \
--data /Users/gokdenizgulmez/Library/Mobile\ Documents/com\~apple\~CloudDocs/Datastes/data_tyni \
--iters 10 \
--batch-size 1 \
--val-batches 1 \
--adapter-path /Users/gokdenizgulmez/Desktop/tinyGemma-pretrain-dora
Loading pretrained model
Loading datasets
Training
Training model with DoRA.
Trainable parameters: 0.023% (0.000M/2.049M)
Starting training..., iters: 10
Iter 1: Val loss 12.460, Val took 0.241s
Iter 2: Saved adapter weights to /Users/gokdenizgulmez/Desktop/tinyGemma-pretrain-dora/adapter.safetensors and /Users/gokdenizgulmez/Desktop/tinyGemma-pretrain-dora/0000002_adapters.safetensors.
Iter 4: Saved adapter weights to /Users/gokdenizgulmez/Desktop/tinyGemma-pretrain-dora/adapter.safetensors and /Users/gokdenizgulmez/Desktop/tinyGemma-pretrain-dora/0000004_adapters.safetensors.
Iter 6: Saved adapter weights to /Users/gokdenizgulmez/Desktop/tinyGemma-pretrain-dora/adapter.safetensors and /Users/gokdenizgulmez/Desktop/tinyGemma-pretrain-dora/0000006_adapters.safetensors.
Iter 8: Saved adapter weights to /Users/gokdenizgulmez/Desktop/tinyGemma-pretrain-dora/adapter.safetensors and /Users/gokdenizgulmez/Desktop/tinyGemma-pretrain-dora/0000008_adapters.safetensors.
Iter 10: Val loss 12.457, Val took 0.164s
Iter 10: Train loss 12.457, Learning Rate 1.000e-05, It/sec 26.589, Tokens/sec 26033.065, Trained Tokens 9791, Peak mem 6.129 GB
Iter 10: Saved adapter weights to /Users/gokdenizgulmez/Desktop/tinyGemma-pretrain-dora/adapter.safetensors and /Users/gokdenizgulmez/Desktop/tinyGemma-pretrain-dora/0000010_adapters.safetensors.
Saved final adapter weights to /Users/gokdenizgulmez/Desktop/tinyGemma-pretrain-dora/adapter.safetensors.
Thanks!! Will review shortly!
Perfect!
Any idea how to correct this error?
File ".../venv/lib/python3.8/site-packages/mlx/nn/utils.py", line 34, in wrapped_value_grad_fn
value, grad = value_grad_fn(model.trainable_parameters(), *args, **kwargs)
RuntimeError: QuantizedMatmul::vjp no gradient wrt the quantized matrix yet.
You can't fine-tune the quantized layers. You can use a fp16, bf16, or fp32 model for full fine-tuning. The half precision types need care to avoid numerical issues, so ymmv. If you want to use a quantized model you can do QLoRA.
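If the checkpoint you have is quantized, one way around this is to re-convert the original Hugging Face model to half precision and full fine-tune that instead. A minimal sketch, assuming the mlx_lm convert helper and its dtype argument as exposed in recent releases (the repo id and output path are placeholders):

from mlx_lm import convert

# Convert the Hugging Face model to bf16 MLX weights without quantizing,
# so every layer has gradients available for full fine-tuning.
convert(
    hf_path="my-org/my-model",      # placeholder repo id
    mlx_path="mlx_model_bf16",      # placeholder output directory
    quantize=False,
    dtype="bfloat16",
)

Then point --model at mlx_model_bf16 when launching the full fine-tune.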
OK. Thanks for the info.
I probably won't attempt full tuning of 16-bit models on my 2021 M1.
I was trying to fine tune a LoRA on mlx-community/DeepSeek-V2-Lite-Chat-4bit-mlx
but failed with this error:
File ".../llms/mlx_lm/tuner/utils.py", line 132, in linear_to_lora_layers
raise ValueError(f"Lora does not support {model.model_type}")
ValueError: Lora does not support deepseek_v2
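For reference, the check that raises here keeps a per-architecture list of linear layers to wrap with LoRA and rejects unknown model types. A rough, hypothetical sketch of that dispatch (the key names below are assumptions, not the actual tuner/utils.py source):

def linear_to_lora_layers(model, num_layers, config):
    # Each supported architecture maps to the projections that get LoRA-wrapped.
    if model.model_type in ("llama", "mistral", "gemma", "qwen2"):
        keys = ["self_attn.q_proj", "self_attn.v_proj"]
    else:
        # deepseek_v2 lands here until the architecture is added to the map.
        raise ValueError(f"Lora does not support {model.model_type}")
    return keys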
@Jonathan-Dobson here is a fix for that https://github.com/ml-explore/mlx-examples/pull/932. Will put it in a new pypi release once it lands.
#932 fixed the error and allows Fine Tuning to start now.
Given a --fine-tune-type full training and the saved model in the adapters directory,
When attempting to use generate.py like this:
python -m mlx_lm.generate \
--model mlx-community/Qwen2-0.5B \
--prompt $P \
--adapter-path adapters
The command fails with this error:
File ".../venv/lib/python3.8/site-packages/mlx/nn/layers/base.py", line 204, in load_weights
weights = list(mx.load(weights).items())
RuntimeError: [load_safetensors] Failed to open file adapters/adapters.safetensors
Here are the contents of adapters after the full fine tune with updated lora.py:
adapter_config.json
model.safetensors
It looks like running the full fine-tune type saves the weights as model.safetensors now, but generate.py still expects the original adapters.safetensors name.
Alternatively, running generate.py with the adapters/ path as the --model still causes this error:
File ".../llms/mlx_lm/utils.py", line 346, in load_config
with open(model_path / "config.json", "r") as f:
FileNotFoundError: [Errno 2] No such file or directory: 'adapters/config.json'
It's not saving all the model files correctly yet. What I tried is copying the original config and tokenizer files into this adapters folder and then generating without the --adapter-path flag, passing the adapters path to --model instead; since full fine-tuning saves the full model weights, the adapter flag is only needed for LoRA fine-tuning. I'll push the fix later, thanks for the feedback. In the meantime, try it again after renaming model.safetensors to adapters.safetensors; that should swap the old model weights for the new ones.
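In Python terms, that first workaround looks roughly like this (the source path is a placeholder, and the tokenizer file names can differ per model):

import shutil
from pathlib import Path

src = Path("path/to/original/model")  # placeholder: the original MLX model directory
dst = Path("adapters")                # output directory of the full fine-tune

# Copy over the files generate.py expects to find next to the weights.
for name in ("config.json", "tokenizer.json", "tokenizer_config.json"):
    if (src / name).exists():
        shutil.copy(src / name, dst / name)

After that, generate with --model adapters and without --adapter-path.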
Hey I'm back and it's fixed.
It now saves the full model along with its needed files like config.json, the tokenizer, and so on.
You can now just generate with:
python -m mlx_lm.generate \
--model path/to/adapters \
--prompt $P
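The same thing works through the Python API as well, assuming mlx_lm's load and generate helpers (the prompt is just an example):

from mlx_lm import load, generate

# Load the fully fine-tuned weights directly as the model.
model, tokenizer = load("path/to/adapters")
print(generate(model, tokenizer, prompt="Hello", max_tokens=100))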
Hey @awni, I want to ask if I need to do or change something for it to be merged?
Apologies for the delay. Let me take a look this week and get back to you!
@awni Thanks for the quick reply! I removed unnecessary code, merged the LoRA and DoRA cases, and added the new LoRA layers. Sorry for the bad code; I think the merge was not done correctly.
@awni @Goekdeniz-Guelmez it could be nice to add a FINETUNE.md to mlx-examples/llms/mlx_lm imho
A nice and detailed description of how to fine-tune is already in the LORA.md file here. Do you mean there should be a separate and more detailed explanation?
Yeah, I meant a separate file. I think full fine-tuning warrants its own guide, as it's a long-wanted feature for MLX. Great work btw 👏