Awni Hannun

1,014 comments by Awni Hannun

Oh interesting. So it only hangs if you run over the network? It might be good to put an eval after you reduce the grads so we can see if...
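For context, a minimal sketch of what "put an eval after you reduce the grads" could look like, assuming the gradients are averaged with `mx.distributed.all_sum`; the `grads` dictionary and its shapes are hypothetical stand-ins for whatever your training step produces:

```python
import mlx.core as mx
from mlx.utils import tree_map

world = mx.distributed.init()

# Stand-in for the gradients returned by the training step (hypothetical shapes).
grads = {"w": mx.ones((8, 8)), "b": mx.ones((8,))}

# Average the gradients across ranks.
grads = tree_map(lambda g: mx.distributed.all_sum(g) / world.size(), grads)

# Force the reduction to run right here; if the script hangs on this line,
# the problem is in the distributed all_sum rather than some later op.
mx.eval(grads)
print(world.rank(), grads["b"])
```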

Have you tried something really simple just to debug the connection? Something like the following:

```python
import mlx.core as mx

world = mx.distributed.init()
x = mx.distributed.all_sum(mx.ones(10))
print(world.rank(), x)
```
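As a usage note (assuming the MPI backend; the hostnames and script name are placeholders), a script like that can be launched on two machines with something like `mpirun -np 2 --host host1,host2 python test_dist.py`. With two ranks, every entry of `x` should come back as `2`.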

So it works fine for me even over a network. It is just quite slow. Could you try the same script but with smaller sizes (like decrease the input and...

Thanks!! Will review shortly!

You can't fine-tune the quantized layers. You can use an fp16, bf16, or fp32 model for full fine-tuning. The half-precision types need care to avoid numerical issues, so ymmv...
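For reference, a minimal sketch of full fine-tuning in bf16; the tiny model, shapes, and loss here are hypothetical stand-ins for a real LLM, and computing the loss in float32 is just one way to soften the numerical issues mentioned above:

```python
import mlx.core as mx
import mlx.nn as nn
from mlx.utils import tree_map

# Hypothetical tiny model standing in for a full model.
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 16))

# Cast all parameters to bfloat16 for full (not quantized) fine-tuning.
model.update(tree_map(lambda p: p.astype(mx.bfloat16), model.parameters()))

def loss_fn(model, x, y):
    # Accumulate the loss in float32 to reduce numerical issues.
    return nn.losses.mse_loss(model(x).astype(mx.float32), y)

x = mx.random.normal((4, 16)).astype(mx.bfloat16)
y = mx.random.normal((4, 16))

loss_and_grad = nn.value_and_grad(model, loss_fn)
loss, grads = loss_and_grad(model, x, y)
mx.eval(loss, grads)
print(loss.item())
```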

@Jonathan-Dobson here is a fix for that: https://github.com/ml-explore/mlx-examples/pull/932. Will put it in a new PyPI release once it lands.

> Hey @awni, I want to ask if I need to do or change something for it to be merged?

Apologies for the delay. Let me take a look this...

Back to draft for a few. The rotating buffer doesn't play well with the step prefill for long prompts, so that needs some work.

Ok so I think this can be reviewed and merged. A little note on the "infinite KV cache": For simplicity it separates the cache growth into two stages: prefill (i.e....
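To make the rotating-buffer idea concrete, here is a conceptual sketch of a fixed-size KV cache that keeps a few initial tokens and drops the oldest entries once it is over budget. This is an illustrative simplification (it trims with `concatenate` rather than rotating in place) and not the actual mlx_lm implementation:

```python
import mlx.core as mx

class RotatingKVCacheSketch:
    """Conceptual fixed-size KV cache: grow normally until `max_size`,
    then discard the oldest tokens while keeping a small initial prefix."""

    def __init__(self, max_size: int, keep: int = 4):
        self.max_size = max_size
        self.keep = keep  # number of initial "sink" tokens to always keep
        self.keys = None
        self.values = None

    def update(self, keys: mx.array, values: mx.array):
        # keys/values: (batch, n_heads, n_new_tokens, head_dim)
        if self.keys is None:
            self.keys, self.values = keys, values
        else:
            self.keys = mx.concatenate([self.keys, keys], axis=2)
            self.values = mx.concatenate([self.values, values], axis=2)

        # Once over budget, drop the oldest tokens after the kept prefix.
        n = self.keys.shape[2]
        if n > self.max_size:
            extra = n - self.max_size
            self.keys = mx.concatenate(
                [self.keys[:, :, : self.keep], self.keys[:, :, self.keep + extra :]],
                axis=2,
            )
            self.values = mx.concatenate(
                [self.values[:, :, : self.keep], self.values[:, :, self.keep + extra :]],
                axis=2,
            )
        return self.keys, self.values
```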