
Support for Batch Generation

Open rudrankriyam opened this issue 1 month ago • 3 comments

I thought Awni's batch generation PR in mlx-lm was cool and experimented with it over the weekend. I was able to implement it in MLX Swift with almost the same benchmark numbers on my M5 MacBook Pro with Llama 3.2 3B 4-bit:

| Batch Size | MLX LM (t/s) | MLX Swift (t/s) |
|-----------:|-------------:|----------------:|
| 1          | 61           | 62              |
| 2          | 122          | 118             |
| 32         | 349          | 344             |
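
To make the gap easier to read, here is a small sketch that computes MLX Swift throughput relative to MLX LM, plus the speedup over batch size 1, using only the numbers from the table above. The helper names (`relative_throughput`, `speedup_over_single`) are just illustrative and not part of either library.

```python
# Throughput figures copied from the table above (tokens per second).
results = {
    1: {"mlx_lm": 61, "mlx_swift": 62},
    2: {"mlx_lm": 122, "mlx_swift": 118},
    32: {"mlx_lm": 349, "mlx_swift": 344},
}


def relative_throughput(row):
    """MLX Swift throughput as a fraction of MLX LM throughput."""
    return row["mlx_swift"] / row["mlx_lm"]


def speedup_over_single(table, key):
    """Throughput at each batch size divided by the batch-size-1 throughput."""
    base = table[1][key]
    return {batch: row[key] / base for batch, row in table.items()}


if __name__ == "__main__":
    swift_speedup = speedup_over_single(results, "mlx_swift")
    lm_speedup = speedup_over_single(results, "mlx_lm")
    for batch, row in results.items():
        print(
            f"batch {batch:>2}: Swift/LM = {relative_throughput(row):.3f}, "
            f"speedup vs batch 1: LM {lm_speedup[batch]:.2f}x, "
            f"Swift {swift_speedup[batch]:.2f}x"
        )
```

By these numbers, the Swift implementation stays within a few percent of MLX LM at every batch size listed.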

There are probably subtle improvements that I have not been able to find yet, and I think a review would help me spot them.

Creating this issue in case somebody else is already working on it. If not, I can clean up my branch and send a PR!

rudrankriyam, Nov 03 '25 18:11