Prince Canuma
VisionZip: a simple yet effective method that selects a set of informative tokens for input to the language model, reducing visual-token redundancy and improving efficiency while maintaining model performance....
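The selection step can be sketched as keeping the top-k visual tokens ranked by an importance score (a minimal illustration with assumed names and random data, not VisionZip's actual implementation, which derives scores from attention):

```python
import numpy as np

def select_informative_tokens(tokens, attn_scores, k):
    """Keep the k visual tokens with the highest importance scores.

    tokens: (n, d) array of visual token embeddings
    attn_scores: (n,) per-token importance (e.g. attention received)
    """
    keep = np.argsort(attn_scores)[-k:]  # indices of the k largest scores
    keep = np.sort(keep)                 # preserve original token order
    return tokens[keep]

# Toy example: reduce a 24x24 patch grid (576 tokens) to 64 tokens.
tokens = np.random.rand(576, 64)
scores = np.random.rand(576)
reduced = select_informative_tokens(tokens, scores, k=64)
print(reduced.shape)  # (64, 64)
```

The language model then only sees the reduced token set, shrinking the prefill cost roughly in proportion to the pruning ratio.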
## Issue

I keep getting `nan` loss when training Llama-3.2-vision.

I tried:
- gradient clipping
- lower learning rate
- higher batch size, LoRA rank and alpha

But with no...
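For reference, global-norm gradient clipping (the first mitigation listed) works by rescaling all gradients so their combined L2 norm stays below a threshold. A framework-agnostic sketch in NumPy (names are illustrative; MLX and PyTorch ship their own helpers for this):

```python
import numpy as np

def clip_grad_norm(grads, max_norm):
    """Scale a list of gradient arrays so their global L2 norm <= max_norm."""
    total = np.sqrt(sum(np.sum(g ** 2) for g in grads))
    scale = min(1.0, max_norm / (total + 1e-6))  # no-op if already small
    return [g * scale for g in grads], total

# Exploding-gradient example: global norm ~26.5 gets scaled down to ~1.0.
grads = [np.ones((2, 2)) * 10.0, np.ones(3) * 10.0]
clipped, pre_norm = clip_grad_norm(grads, max_norm=1.0)
```

Note that clipping only bounds gradient magnitude; a `nan` that originates in the forward pass (e.g. an overflow in fp16) passes through unchanged, which is why clipping alone often fails to fix `nan` loss.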
This is just a starting point; @BenLumenDigital is taking care of it.

## Checklist
- [ ] Tests added/updated
- [ ] Documentation updated
- [ ] Issue referenced (e.g.,...
Uses the BigVGAN codec, so all you need to add is the discrete diffusion, encoder, and conditioning model: https://huggingface.co/PlayHT/PlayDiffusion
Supported models:
- Qwen3 VL + MoE
- Idefics 2 & 3

Closes #40, #48
**Summary:**
This PR removes the dependency on `torch`, `torchvision`, and `transformers` by porting the necessary processors directly into `mlx-vlm`. It also restructures `pyproject.toml` to support optional installations.

**Changes:**
* **Removed...
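Optional installations in `pyproject.toml` are typically expressed as extras groups; a minimal sketch (group names and pins are illustrative, not the PR's actual layout):

```toml
[project.optional-dependencies]
# Installed via: pip install "mlx-vlm[torch]"
torch = ["torch>=2.0", "torchvision"]
dev = ["pytest", "ruff"]
```

With this layout, the base `pip install mlx-vlm` pulls in none of the heavy frameworks, and users opt in per extra.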
Closes #594
The high-level idea would be to define the message format at the model level as a property (i.e., `get_messages`) that exists for models that support video, or raises an error for models that...
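A minimal sketch of that shape, assuming a base class and the `get_messages` name from the comment (class names and the message payload are hypothetical):

```python
class BaseModel:
    """Hypothetical base: models that lack a video message format raise."""

    @property
    def get_messages(self):
        raise NotImplementedError(
            f"{type(self).__name__} does not define a video message format"
        )

class VideoModel(BaseModel):
    """Hypothetical model that supports video input."""

    @property
    def get_messages(self):
        # Model-specific chat template for video + text turns.
        return [{"role": "user",
                 "content": [{"type": "video"}, {"type": "text"}]}]

msgs = VideoModel().get_messages
print(msgs[0]["role"])  # user
```

Callers can then branch on capability by catching `NotImplementedError` instead of hard-coding per-model checks.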
### Discussed in https://github.com/Blaizzy/mlx-vlm/discussions/476

Originally posted by **avishekjana**, August 27, 2025

Hi, I’m trying to fine-tune LLaMA 3.2 Vision and Qwen 2 VL, but the main challenge I’m facing is...
Closes #270