fix rope deltas in training
Addresses #404 and #409.
This is not tested yet; I will do so when I'm home.
Is it ready?
ping @Goekdeniz-Guelmez
Should be working; can you try it out too, @Blaizzy?
```shell
python -m mlx_vlm.lora \
  --model-path mlx-community/Qwen2-VL-2B-Instruct-bf16 \
  --dataset TIGER-Lab/VisualWebInstruct-Seed --dataset-config 'reference' \
  --output-path /Volumes/T7_Shield/mlx-vlm \
  --batch-size 1 \
  --steps 20 \
  --learning-rate 1e-4
```
```
INFO:main:Loading model from mlx-community/Qwen2-VL-2B-Instruct-bf16
Fetching 11 files: 100%|█████████████████████████████████████████████| 11/11 [00:00<00:00, 13768.23it/s]
The image processor of type Qwen2VLImageProcessor is now loaded as a fast processor by default, even if the model checkpoint was saved with a slow processor. This is a breaking change and may produce slightly different outputs. To continue using the slow processor, instantiate this class with use_fast=False. Note that this behavior will be extended to all models in a future release.
Fetching 11 files: 100%|█████████████████████████████████████████████| 11/11 [00:00<00:00, 74295.24it/s]
INFO:main:Loading dataset from TIGER-Lab/VisualWebInstruct-Seed
INFO:main:Applying chat template to the dataset
INFO:main:Setting up LoRA
#trainable params: 11.54048 M || all params: 1543.714304 M || trainable%: 0.748%
INFO:main:Setting up optimizer
INFO:main:Setting up trainer
INFO:main:Training model
{'Epoch': 0, 'Step': 0, 'Loss': '1.2262'}
{'Epoch': 0, 'Step': 10, 'Loss': '1.7623'}
100%|████████████████████████████████████| 20/20 [00:51<00:00, 2.55s/it, Epoch=0, Step=19, Loss=1.9172]
```
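As a sanity check, the reported `trainable%` is just the ratio of the two parameter counts the log prints (a quick arithmetic sketch, not part of mlx-vlm itself):

```python
# Parameter counts taken from the LoRA setup log line above (in millions).
trainable_m = 11.54048
total_m = 1543.714304

# Trainable fraction as a percentage, rounded the same way as the log.
pct = 100 * trainable_m / total_m
print(f"trainable%: {pct:.3f}%")  # → trainable%: 0.748%
```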
Could you run:

```shell
pre-commit run --all
```