Awni Hannun
We have stubs in the Python MLX distribution which should tell IDEs what the types are. If you look in the path `python -c "import os; import mlx.core as mx;...
It could be a bug with the type hints.
This should be fixed in the latest MLX. Please let us know if you still see a problem in the stubs.
Yes, this isn't a bug: the GPU back-end is not yet implemented. It will most likely take some time before we have GPU support for matrix inversion. I changed...
My recommendation is to use the CPU for now. You can do something like:

```python
out = mx.linalg.inv(x, stream=mx.cpu)
```

just for that operation.
No update, sorry. It's available on the CPU for now; use e.g. `stream=mx.cpu`.
Maybe you are using an old version of MLX:

```
>>> mx.linalg.inv(mx.ones((2, 2)))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: [linalg::inv] This op is not yet...
```
The performance differences you are seeing are likely due to implicit casting. When you call sdpa it will promote `q`, `k`, `v`, and `mask` to a common data type. If...
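As a rough illustration of the promotion behavior described above (this sketch uses NumPy's `np.result_type` and array promotion as a stand-in, since NumPy follows similar common-type rules for this case; it is not MLX's API): a single float32 operand pulls float16 inputs up to float32, so the whole computation runs in the wider type.

```python
import numpy as np

# Mixing a float32 operand with float16 operands promotes to float32,
# analogous to sdpa promoting q, k, v, and mask to a common dtype.
print(np.result_type(np.float16, np.float16))  # float16
print(np.result_type(np.float16, np.float32))  # float32

q = np.ones((4, 8), dtype=np.float16)
mask = np.zeros((4, 4), dtype=np.float32)

# q @ q.T stays float16, but adding the float32 mask promotes
# the result (and the work) to float32.
scores = q @ q.T + mask
print(scores.dtype)  # float32
```

Casting the mask to the narrower type up front (e.g. `mask.astype(np.float16)`) keeps the computation in the cheaper dtype, which is the same idea as casting explicitly before calling sdpa.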
> Could you help me understand why implicit casting inside sdpa seems slower than explicit casting outside of it? I didn't expect it to make such a difference. Sorry, there's...
I don't see the same results on my M1 Max. They look pretty similar, though there is some variance in the timings in general:

```
q mlx.core.bfloat16, k mlx.core.bfloat16, v...
```