
fix rope deltas in training

Open Goekdeniz-Guelmez opened this issue 7 months ago • 8 comments

Goekdeniz-Guelmez avatar Jul 10 '25 13:07 Goekdeniz-Guelmez

Addresses #404

Goekdeniz-Guelmez avatar Jul 10 '25 13:07 Goekdeniz-Guelmez

this is not tested, I will do so when I'm home.

Goekdeniz-Guelmez avatar Jul 10 '25 13:07 Goekdeniz-Guelmez

also #409

Goekdeniz-Guelmez avatar Jul 10 '25 13:07 Goekdeniz-Guelmez

> this is not tested, I will do so when I'm home.

Is it ready?

Blaizzy avatar Jul 22 '25 00:07 Blaizzy

ping @Goekdeniz-Guelmez

Blaizzy avatar Sep 03 '25 09:09 Blaizzy

Should be working, can you try it out too @Blaizzy?

Goekdeniz-Guelmez avatar Sep 03 '25 09:09 Goekdeniz-Guelmez

```
python -m mlx_vlm.lora \
  --model-path mlx-community/Qwen2-VL-2B-Instruct-bf16 \
  --dataset TIGER-Lab/VisualWebInstruct-Seed --dataset-config 'reference' \
  --output-path /Volumes/T7_Shield/mlx-vlm \
  --batch-size 1 \
  --steps 20 \
  --learning-rate 1e-4
```

```
INFO:main:Loading model from mlx-community/Qwen2-VL-2B-Instruct-bf16
Fetching 11 files: 100%|█████████████████████████████████████████████| 11/11 [00:00<00:00, 13768.23it/s]
The image processor of type Qwen2VLImageProcessor is now loaded as a fast processor by default, even if the model checkpoint was saved with a slow processor. This is a breaking change and may produce slightly different outputs. To continue using the slow processor, instantiate this class with use_fast=False. Note that this behavior will be extended to all models in a future release.
Fetching 11 files: 100%|█████████████████████████████████████████████| 11/11 [00:00<00:00, 74295.24it/s]
INFO:main:Loading dataset from TIGER-Lab/VisualWebInstruct-Seed
INFO:main:Applying chat template to the dataset
INFO:main:Setting up LoRA
#trainable params: 11.54048 M || all params: 1543.714304 M || trainable%: 0.748%
INFO:main:Setting up optimizer
INFO:main:Setting up trainer
INFO:main:Training model
{'Epoch': 0, 'Step': 0, 'Loss': '1.2262'}
{'Epoch': 0, 'Step': 10, 'Loss': '1.7623'}
100%|████████████████████████████████████| 20/20 [00:51<00:00, 2.55s/it, Epoch=0, Step=19, Loss=1.9172]
```
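For anyone following along, here is a rough sketch of the kind of bug this PR targets. Qwen2-VL-style models compute multimodal RoPE position offsets ("rope deltas") once per prompt and cache them so incremental decoding can reuse them. During training every batch is a fresh sequence, so a stale cached delta from a previous batch would shift the position ids. The fix amounts to recomputing (or never caching) the deltas while the model is in training mode. The class and names below are illustrative toys, not mlx-vlm's actual API:

```python
class RopeDeltaCache:
    """Toy model of rope-delta caching behaviour, for illustration only."""

    def __init__(self):
        self.rope_deltas = None  # cached per-sequence position offset
        self.training = False

    def get_deltas(self, seq_len: int) -> int:
        # Inference: reuse the cached delta across decode steps.
        if not self.training and self.rope_deltas is not None:
            return self.rope_deltas
        # Training (or first call): always recompute from the current batch,
        # and never persist a value that could leak into the next batch.
        delta = seq_len - 1  # stand-in for the real multimodal computation
        if not self.training:
            self.rope_deltas = delta
        return delta


cache = RopeDeltaCache()
cache.training = True
cache.get_deltas(8)  # recomputed from this batch
cache.get_deltas(3)  # recomputed again; no stale value carried over
```

In training mode `rope_deltas` stays `None`, so each batch gets offsets derived from its own lengths; in eval mode the first computed delta is reused for subsequent decode steps.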

Goekdeniz-Guelmez avatar Sep 03 '25 10:09 Goekdeniz-Guelmez

Could you run:

pre-commit run --all-files

Blaizzy avatar Oct 03 '25 12:10 Blaizzy