Awni Hannun
Just that: https://github.com/ml-explore/mlx/blob/main/python/src/array.cpp#L672-L684
Double isn’t possible in Metal. In theory we could do it on the CPU only, but that is likely a lot less interesting to you?
Sounds good, I'll leave this open for now as a possible enhancement. I don't know if we will do it, but people can comment here with use cases etc to...
It is a limitation of the hardware / Metal stack. It's unlikely we will have a float64 GPU back-end anytime in the foreseeable future. A `float64` CPU is doable.. but...
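To make concrete what is lost when `float64` data has to pass through `float32`, here is a minimal NumPy sketch. It does not use MLX; the helper name `roundtrip_error` and the sample values are hypothetical and chosen only for illustration.

```python
import numpy as np


def roundtrip_error(values):
    """Largest absolute error after casting float64 -> float32 -> float64."""
    a64 = np.asarray(values, dtype=np.float64)
    a32 = a64.astype(np.float32)  # down-cast, as required for a float32-only back-end
    return float(np.max(np.abs(a32.astype(np.float64) - a64)))


# Values exactly representable in float32 survive the round trip unchanged.
exact = roundtrip_error([0.5, 1.0, 2.0])

# A perturbation of 1e-10 near 1.0 is below float32 resolution (about 1.2e-7)
# and is rounded away, so the error is roughly the size of the perturbation.
lossy = roundtrip_error([1.0 + 1e-10])
```

Whether this loss matters depends on the workload; the sketch only shows that the down-cast is not free of error.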
> Would MLX support for float64 on CPU offer any benefit over using numpy when converting back-and-forth between float64 and float32 in scenarios of mixed precision codes where the bulk...
> Implementation of explicit_gemm_conv_ND_cpu. However, using this seems to be considerably slower than the naive implementation. I guess that materializing the strided input view takes very long. The actual gemm is...
@mlaves are you planning to update this?
> Should I remove this now or is there any chance the reshaping will be faster in the future?

It's possible, we haven't looked at / optimized the CPU copies...
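For readers unfamiliar with the trade-off being discussed, the following is a rough NumPy sketch of a naive convolution next to an explicit-GEMM one. It is not the MLX `explicit_gemm_conv_ND_cpu` code: the 1-D, unbatched setting, the `(L, C_in)` input layout, the `(K, C_in, C_out)` weight layout, and both function names are assumptions made for illustration. The point is only that the sliding-window view is free to create, while reshaping it into a patch matrix forces a copy.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def naive_conv1d(x, w):
    """Direct 'valid' cross-correlation. x: (L, C_in), w: (K, C_in, C_out)."""
    L, _ = x.shape
    K, _, C_out = w.shape
    out = np.zeros((L - K + 1, C_out), dtype=x.dtype)
    for i in range(L - K + 1):
        for k in range(K):
            out[i] += x[i + k] @ w[k]
    return out


def explicit_gemm_conv1d(x, w):
    """Same result via an explicit patch matrix followed by one matmul."""
    L, C_in = x.shape
    K, _, C_out = w.shape
    # Zero-copy strided view with shape (L - K + 1, C_in, K).
    windows = sliding_window_view(x, K, axis=0)
    # Reordering to (L - K + 1, K, C_in) and flattening the last two axes
    # cannot be expressed with strides alone, so NumPy materializes a copy
    # that is about K times larger than the input.
    patches = windows.transpose(0, 2, 1).reshape(L - K + 1, K * C_in)
    return patches @ w.reshape(K * C_in, C_out)


rng = np.random.default_rng(0)
x = rng.standard_normal((16, 3))
w = rng.standard_normal((5, 3, 4))
same = np.allclose(naive_conv1d(x, w), explicit_gemm_conv1d(x, w))
```

In this sketch the cost of building `patches` grows with the kernel size, which is one plausible reason an explicit-GEMM path can lose to a direct loop when the copy itself is not optimized.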
Sounds good to me!
The op should be implemented in C++ and then exposed through a Python binding (we try to keep the C++ and Python APIs reasonably consistent). I think the Python impl from @angeloskath is...