Awni Hannun

Results: 1014 comments by Awni Hannun

> I'm not sure if it makes sense to implement such a hybrid in mlx, as it would require a lot of synchronization points. @awni what do you think? Indeed...

CC @jagrit06 any reason that one is not instantiated for bfloat? Also @kaeru-shigure how did you come across this? Could you share the program you ran?

One challenge here is that FFT is not yet supported on the GPU (in Metal). So you could use it but on the CPU it would almost certainly be much...

Ok sounds good! Thanks for the benchmarks, that's really interesting!

One option is to update the CPU convolution to dispatch to an FFT implementation when the input sizes make sense. We would want to benchmark it in a few settings...
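To make the dispatch idea above concrete, here is a minimal pure-Python sketch, not MLX code. It computes a full 1-D convolution either directly or via an FFT and picks between them by kernel length. The names `conv_direct`, `conv_fft`, `conv1d`, and the `FFT_MIN_KERNEL` threshold are all hypothetical; a real cutoff would have to come from the kind of benchmarking mentioned above. Note it implements mathematical convolution (flipped kernel), whereas ML frameworks usually compute cross-correlation.

```python
import cmath


def _fft(a, invert=False):
    """Recursive radix-2 Cooley-Tukey FFT; len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even = _fft(a[0::2], invert)
    odd = _fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def conv_direct(x, w):
    """Full 1-D convolution straight from the definition, O(N * K)."""
    out = [0.0] * (len(x) + len(w) - 1)
    for i, xi in enumerate(x):
        for j, wj in enumerate(w):
            out[i + j] += xi * wj
    return out


def conv_fft(x, w):
    """Full 1-D convolution via zero-padded FFTs, O(M log M)."""
    n_out = len(x) + len(w) - 1
    n = 1
    while n < n_out:  # pad to a power of two so circular wrap-around cannot occur
        n *= 2
    fx = _fft(list(x) + [0.0] * (n - len(x)))
    fw = _fft(list(w) + [0.0] * (n - len(w)))
    prod = [a * b for a, b in zip(fx, fw)]
    inv = _fft(prod, invert=True)
    # The unnormalized inverse needs a 1/n scale; inputs are real, so keep .real.
    return [v.real / n for v in inv[:n_out]]


# Hypothetical cutoff, an assumption for illustration only; tune by benchmarking.
FFT_MIN_KERNEL = 32


def conv1d(x, w):
    """Dispatch to the FFT path only when the kernel is long enough."""
    if len(w) >= FFT_MIN_KERNEL:
        return conv_fft(x, w)
    return conv_direct(x, w)
```

Both paths should agree up to floating-point error, so a benchmark can swap the threshold freely without changing results. A real heuristic would likely depend on input length, batch size, and dtype as well as kernel length.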

Looks like it's not applying the metal-cpp patch properly. Could you try a couple of things:

1. Wipe the build/ directory and try again?
2. Check the output of `which patch`...

🤷‍♂️ not sure why it's not applying the patch. You could try changing this line https://github.com/ml-explore/mlx/blob/main/CMakeLists.txt#L97 to

```
PATCH_COMMAND /usr/bin/patch -N -i ${METAL_CPP_PATCH} || true
```

If that doesn't work,...

Good to know. We may need to hard-code that path. I don't know why the other version of `patch` doesn't work.

Yes. @amirhossein-razlighi, let us know if you are already working on it.