Negative padding
I'm hitting this error during training. I had to reduce the batch size to 1 to work around it, but I wonder what the correct way to resolve it would be.
```
INFO:main:Training model
{'Epoch': 0, 'Step': 0, 'Loss': '10.9777'}
{'Epoch': 0, 'Step': 10, 'Loss': '5.8438'}
  0%| | 15/8317 [00:22<3:31:28, 1.53s/it, Epoch=0, Step=14, Loss=4.8478]
Traceback (most recent call last):
  File "", line 198, in _run_module_as_main
  File "", line 88, in _run_code
  File "mlx_vlm/lora.py", line 178, in <module>
    main(args)
  File "mlx_vlm/lora.py", line 98, in main
    loss = trainer.train_step(
  File "mlx_vlm/trainer/trainer.py", line 265, in train_step
    loss, grads = loss_and_grad_fn(self.model, batch)
  File "mlx/nn/utils.py", line 35, in wrapped_value_grad_fn
    value, grad = value_grad_fn(model.trainable_parameters(), *args, **kwargs)
  File "mlx/nn/utils.py", line 29, in inner_fn
    return fn(*args, **kwargs)
  File "mlx_vlm/trainer/trainer.py", line 230, in loss_fn
    outputs = model(input_ids, pixel_values, attention_mask, **kwargs)
  File "mlx_vlm/models/qwen2_vl/qwen2_vl.py", line 116, in __call__
    input_embeddings = self.get_input_embeddings(
  File "mlx_vlm/models/qwen2_vl/qwen2_vl.py", line 78, in get_input_embeddings
    final_inputs_embeds = self._merge_input_ids_with_image_features(
  File "mlx_vlm/models/qwen2_vl/qwen2_vl.py", line 96, in _merge_input_ids_with_image_features
    image_features = mx.pad(image_features, ((0, 0), (0, pad_size), (0, 0)))
ValueError: Invalid high padding size (-60) passed to pad for axis 1. Padding sizes must be non-negative
```
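For context on why the padding size can go negative: the failing frame computes `pad_size` as the gap between a target sequence length and the actual number of image features, and when a sample produces *more* features than the target, the gap is negative, which `mx.pad` (like `np.pad`) rejects. A minimal sketch of the pattern and a guard, using NumPy stand-ins (the function and `expected_len` are illustrative, not the actual `mlx_vlm` code):

```python
import numpy as np

def pad_or_trim(image_features: np.ndarray, expected_len: int) -> np.ndarray:
    """Bring axis 1 of a (batch, seq, dim) array to expected_len."""
    pad_size = expected_len - image_features.shape[1]
    if pad_size >= 0:
        # Safe case: zero-pad the high side of axis 1.
        return np.pad(image_features, ((0, 0), (0, pad_size), (0, 0)))
    # Negative pad_size: more features than expected. np.pad (and mx.pad)
    # refuse negative sizes, so slice down to the expected length instead.
    return image_features[:, :expected_len, :]

feats = np.zeros((1, 160, 8))
print(pad_or_trim(feats, 100).shape)  # too many features -> trimmed: (1, 100, 8)
print(pad_or_trim(feats, 200).shape)  # too few -> zero-padded: (1, 200, 8)
```

With batch size 1 the target length is just the sample's own length, so `pad_size` is always 0, which is why shrinking the batch hides the bug rather than fixing it.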
Hey @pavelgur
Could you share a reproducible script, dataset and model?
The trainer is definitely due for an overhaul. The initial version has some limitations around batch size and multi-image inputs for certain models.
I changed the logic to a padding-free approach. Please check it out and let me know if it works for you: #227
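For readers curious what "padding-free" means in general: instead of padding variable-length samples to a common length and masking, each sample is processed at its natural length and the per-sample losses are averaged. This is a generic sketch of that idea, not the actual #227 implementation; `loss_fn` here is a placeholder callable:

```python
import numpy as np

def padding_free_loss(loss_fn, samples):
    """Average per-sample losses without padding variable-length inputs.

    loss_fn: callable that accepts a single variable-length sample.
    samples: list of arrays whose lengths may differ.
    """
    # Each sample keeps its own length, so no pad sizes are ever computed
    # and the negative-padding failure mode cannot occur.
    losses = [loss_fn(s) for s in samples]
    return float(np.mean(losses))

# Toy example: the "loss" is just the mean of each sample.
samples = [np.ones(5), np.ones(7) * 3.0]
print(padding_free_loss(lambda s: s.mean(), samples))  # (1.0 + 3.0) / 2 = 2.0
```

The trade-off is less vectorization across the batch, which is usually acceptable for LoRA-style fine-tuning workloads.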