Awni Hannun
We have stubs in the Python MLX distribution which should tell IDEs what the types are. If you look in the path `python -c "import os; import mlx.core as mx;...
It could be a bug with the type hints.
This should be fixed in the latest MLX. Please let us know if you still see a problem in the stubs.
Yes, this isn't a bug: the GPU back-end is not yet implemented. It will most likely take some time before we have GPU support for matrix inversion. I changed...
My recommendation is to use the CPU for now. You can do something like:

```python
out = mx.linalg.inv(x, stream=mx.cpu)
```

just for that operation.
No update, sorry. It's available on the CPU for now; use e.g. `stream=mx.cpu`.
Maybe you are using an old version of MLX:

```
>>> mx.linalg.inv(mx.ones((2, 2)))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: [linalg::inv] This op is not yet...
```
The performance differences you are seeing are likely due to implicit casting. When you call sdpa it will promote `q`, `k`, `v`, and `mask` to a common data type. If...
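As a rough illustration of the promotion behavior described above (this sketch uses NumPy's `np.result_type` and array promotion as a stand-in, since NumPy follows similar common-type rules for this case; it is not MLX's API): a single float32 operand pulls float16 inputs up to float32, so the whole computation runs in the wider type.

```python
import numpy as np

# Mixing a float32 operand with float16 operands promotes to float32,
# analogous to sdpa promoting q, k, v, and mask to a common dtype.
print(np.result_type(np.float16, np.float16))  # float16
print(np.result_type(np.float16, np.float32))  # float32

q = np.ones((4, 8), dtype=np.float16)
mask = np.zeros((4, 4), dtype=np.float32)

# q @ q.T stays float16, but adding the float32 mask promotes
# the result (and the work) to float32.
scores = q @ q.T + mask
print(scores.dtype)  # float32
```

Casting the mask to the narrower type up front (e.g. `mask.astype(np.float16)`) keeps the computation in the cheaper dtype, which is the same idea as casting explicitly before calling sdpa.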
> Could you help me understand why implicit casting inside sdpa seems slower than explicit casting outside of it? I didn't expect it to make such a difference. Sorry, there's...
I don't see the same results on my M1 Max. They look pretty similar, though there is some variance in the timings in general:

```
q mlx.core.bfloat16, k mlx.core.bfloat16, v...
```