mlx
Adding linear algebra and other array operations
It looks like this is still missing many matrix operations like QR, SVD, einsum, etc. Is there a clear path to using these with or without MLX?
This has been a similar issue with the PyTorch MPS backend. While there is a long tail of these operations to support, they are essential to many machine learning models. As can be seen in the PyTorch issue, not including them limits the utility of packages like this.
Huge +1 to this. Would be amazing to not have to drop back to numpy/CPU for these sorts of things.
Hi! I am quite interested in working on this but not really sure how to start. Would someone be able to point me in the right direction?
I would even be open to having a short meeting if required.
I work on an M2 Max. Thank you :)
@awni
Matrix factorizations aren't easily parallelizable on the GPU.
Would QR and SVD only have CPU implementations for now? @awni
We would love to have these operations available directly in MLX. It's not our top top priority but something we intend to add in the future or even better accept contributions for.
If you are interested in contributing, here are some thoughts:
- To the extent that we can avoid writing these from scratch that is good.
- For the CPU we can use LAPACK and/or Accelerate depending on what's available in each. A good starting point would be to wrap an op from one of those just for the CPU (and throw for the GPU).
- On the GPU there are also some pre-written kernels we can use from MPS, for example [cholesky](https://developer.apple.com/documentation/metalperformanceshaders/mpsmatrixdecompositioncholesky?language=objc). You can see an example of how to wrap MPS matmul. The others could be done similarly.
- For ops not supported by MPS, we'd need kernels which is a bigger project, but a fun one for those up for a challenge!
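A rough illustration of the "CPU first, throw for the GPU" idea from the list above, written in plain Python rather than MLX's C++. Everything here is hypothetical: `cholesky`, `cholesky_cpu`, and the `device` string are made-up names, and the pure-Python factorization only stands in for the LAPACK/Accelerate call a real op would wrap.

```python
import math


def cholesky_cpu(a):
    """Stand-in for a LAPACK/Accelerate call: lower-triangular L with a = L @ L^T."""
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(l[i][k] * l[j][k] for k in range(j))
            if i == j:
                l[i][j] = math.sqrt(a[i][i] - s)
            else:
                l[i][j] = (a[i][j] - s) / l[j][j]
    return l


def cholesky(a, device="cpu"):
    """Hypothetical front-end op: implemented on the CPU, raises on any other device."""
    if device == "cpu":
        return cholesky_cpu(a)
    raise NotImplementedError(f"cholesky has no {device} implementation yet")


L = cholesky([[4.0, 2.0], [2.0, 3.0]])
```

Calling `cholesky(..., device="gpu")` raises `NotImplementedError`, which mirrors shipping the CPU path first and filling in the GPU path later.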
Thoughts on wrapping these linalg specific functions to a separate module on Python frontend?
So you can look at how mlx.core.random works. We could do something similar for mlx.core.linalg. Basically it's a nested namespace on the C++ side, mlx::core::random, and then we make it a submodule in the pybind11 bindings. Then you can do:
import mlx.core as mx
mx.linalg.< >
Any thoughts on implementing at least vector/matrix norm methods such as torch.linalg.vector_norm?
Something like np.linalg.norm for vectors and for a matrix Frobenius norm should be very easy to do. That's also a good place to start just to get the packaging set up.
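To make the scope concrete, here is a minimal pure-Python sketch of the two norms mentioned. It does not use MLX or NumPy, and `vector_norm` and `frobenius_norm` are illustrative names, not an existing MLX API. The point is that the Frobenius norm is just the vector 2-norm of the flattened matrix, so one reduction covers both cases.

```python
import math


def vector_norm(x, p=2):
    """p-norm of a flat sequence of numbers; p=math.inf gives the max-abs norm."""
    if p == math.inf:
        return max(abs(v) for v in x)
    return sum(abs(v) ** p for v in x) ** (1.0 / p)


def frobenius_norm(a):
    """Frobenius norm of a matrix given as a list of rows: flatten, then take the 2-norm."""
    return vector_norm([v for row in a for v in row], p=2)


v_norm = vector_norm([3.0, 4.0])
f_norm = frobenius_norm([[1.0, 2.0], [3.0, 4.0]])
```

In an array library the same thing would presumably be a square, a sum reduction, and a square root over the chosen axes.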
note to self: almost all LAPACK routines are col-major
@awni would transposing an MLX array before sending it to LAPACK routines work here, or is there an alternative way?
No, I wouldn't deal with that using a transpose. You can usually call the routine with the right arguments and avoid a transpose. For example, a row-major [M, N] matrix is the same as a column-major [N, M] matrix in terms of its memory layout.
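A small pure-Python check of the layout claim, with no MLX or LAPACK involved. The helper names are made up for illustration. It flattens a 2 x 3 matrix in row-major order and its 3 x 2 transpose in column-major order and compares the two buffers.

```python
def flatten_row_major(a):
    """Walk each row left to right, rows top to bottom."""
    return [x for row in a for x in row]


def flatten_col_major(a):
    """Walk each column top to bottom, columns left to right."""
    rows, cols = len(a), len(a[0])
    return [a[i][j] for j in range(cols) for i in range(rows)]


def transpose(a):
    return [list(col) for col in zip(*a)]


A = [[1, 2, 3],
     [4, 5, 6]]  # M = 2, N = 3

row_major_buffer = flatten_row_major(A)                 # how a row-major [M, N] array is stored
col_major_buffer = flatten_col_major(transpose(A))      # how a column-major [N, M] array is stored
same_layout = row_major_buffer == col_major_buffer
```

The two buffers are identical, so a column-major routine handed the row-major buffer with the dimensions swapped simply sees the transposed matrix, and no copy is needed.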
Hi @awni, are there any learning resources for Apple Metal and the Accelerate framework? I want to contribute to the linalg module but I do not know where to start. For instance, if I want to build mx.linalg.eig, how can I use LAPACK from Apple's Accelerate framework?
SVD support would be great.
The CPU versions of these are pretty doable. See the QR factorization as an example https://github.com/ml-explore/mlx/blob/main/mlx/backend/common/qrf.cpp
GPU support is more involved, as I don’t think there are many open-source Metal implementations.