Awni Hannun

1014 comments by Awni Hannun

Just FYI, we are prototyping an interruptible eval: https://github.com/ml-explore/mlx/pull/1970. It's probably better not to add it if it's not necessary, so if using a smaller chunk size is sufficient that would...

Looks good! Can you run the formatting hook? Then we can merge it.

Which model were you running?

Can you provide more details on what exactly you are running? Most likely the line it's breaking at is in the [new KV cache](https://github.com/ml-explore/mlx-swift-examples/blob/main/Libraries/LLM/KVCache.swift#L85-L86). It looks like one of the...

> I tested this with Phi 3.5 mini and Llama 3.1 9B, and it mostly seems to work, but on longer, multi-turn prompts I got garbled output from Phi 3.5...

Indeed, the T5 models typically don't work well in fp16. They probably need some kind of activation clipping or rescaling to fix this. `mx.bfloat16` should work though.
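
To illustrate why the dtype matters, here is a minimal sketch using only the Python standard library. It assumes the fp16 problem is activations overflowing the half-precision range, which the comment suggests but does not state outright. The helpers `to_fp16` and `to_bf16` are hypothetical names for this illustration and are not part of MLX; `to_bf16` emulates bfloat16 by truncating a float32 to its top 16 bits, which is cruder than the rounding a real implementation would use.

```python
import struct


def to_fp16(x: float) -> float:
    """Round-trip x through IEEE half precision; out-of-range values become inf."""
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        # struct refuses to pack values beyond the fp16 range (max 65504)
        return float("inf") if x > 0 else float("-inf")


def to_bf16(x: float) -> float:
    """Emulate bfloat16 by keeping the top 16 bits of the float32 encoding."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFF0000))[0]


# A hypothetical large activation value, chosen only for illustration
activation = 70000.0
print(to_fp16(activation))  # overflows the fp16 range
print(to_bf16(activation))  # stays finite, at reduced precision
```

bfloat16 keeps the 8-bit exponent of float32, so it covers roughly the same range at lower precision, while fp16 has only a 5-bit exponent and tops out at 65504.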

Regarding the multimodal vision model, I'd recommend filing an issue on https://github.com/Blaizzy/mlx-vlm. Regarding the Nemotron 51B, we could add it. I will mark this as a feature request. Just FYI...

That's a good point... we should probably fix it.

This exact issue was fixed in https://github.com/ml-explore/mlx/pull/2242, but it hasn't been released yet. Indeed, a good workaround is to use a more verbose name.

It seems reasonable to me! If you are interested in submitting a PR, that would be great.