
Feature Request: Software emulation of mlx.float64 on GPU

Open kyrollosyanny opened this issue 10 months ago • 3 comments

Hello,

I am very excited that MLX now supports mx.float64 on the CPU. I know that Metal does not support float64, but I believe it could be added with software emulation. This would be extremely helpful for optimization and inverse-design problems. float32 is just not enough for simulating physics (for example, optical ray tracing, image processing, VR simulations), and running large workloads on the CPU is slow.

In my opinion, the option to run float64 on the GPU is one of the remaining big differences between PyTorch and MLX. I've mostly switched to MLX, but accuracy errors caused by float32 are becoming more of an issue.

Thank you

kyrollosyanny avatar Feb 25 '25 23:02 kyrollosyanny

Emulating FP64 on the GPU is going to be quite slow and there's a good chance it will wipe out any speed improvements you might expect from running on the GPU.
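To give a rough sense of why emulation is slow: double-float ("double-double" style) emulation represents one higher-precision value as a pair of machine floats, so a single addition expands into several dependent floating-point operations. A minimal sketch of the core building block, Knuth's error-free two-sum, in NumPy float32 (illustrative only, not MLX or Metal code):

```python
import numpy as np

def two_sum(a, b):
    # Knuth's error-free transformation: in exact arithmetic,
    # s + e equals a + b, where s is the rounded float32 sum
    # and e is the rounding error it lost.
    s = a + b
    bp = s - a
    e = (a - (s - bp)) + (b - bp)
    return s, e

a = np.float32(1.0)
b = np.float32(1e-8)
s, e = two_sum(a, b)
# s alone drops b entirely (1.0 + 1e-8 rounds to 1.0 in float32),
# but the pair (s, e) still carries the lost low-order bits.
```

One emulated add already costs six hardware adds here, and emulated multiplies are worse, which is why software FP64 tends to erase the GPU's throughput advantage.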

I think your best bet for running locally in higher precision is:

  • Find a way to make the CPU faster. If there are specific ops that are slow, file an issue and we can look into speeding them up.
  • Offload parts of your computation that can be lower precision (hopefully large matrix multiplies) to the GPU and then run the higher precision stuff on the CPU.

awni avatar Feb 26 '25 14:02 awni

Got it. Naive question: when you say "locally", does that mean there is a way to run MLX in the cloud in higher precision? Thanks a lot.

kyrollosyanny avatar Feb 26 '25 16:02 kyrollosyanny

What I meant is that any framework using an Apple GPU will have the same problem, including PyTorch's MPS backend (which does not support double for the same reason).

awni avatar Feb 26 '25 17:02 awni

I'm closing this issue since we are not going to implement it for the reasons above, and there has not been enough discussion to reconsider.

zcbenz avatar Dec 03 '25 00:12 zcbenz