mlx-swift-examples Support for Batch Generation

Open rudrankriyam opened this issue 1 month ago • 3 comments

I thought Awni's batch generation PR in mlx-lm was cool and experimented with it over the weekend. I was able to implement it in MLX Swift with almost the same benchmark numbers on my M5 MacBook Pro, using Llama 3.2 3B 4-bit:

Batch Size	MLX LM (t/s)	MLX Swift (t/s)
1	61	62
2	122	118
32	349	344

There are probably subtle improvements that I have not been able to find yet, and I think a review would help me out.

Creating this issue in case somebody else is already working on it. If not, I can clean up my branch and send a PR!

Nov 03 '25 18:11 rudrankriyam
