gtensor
alternate fft backends
The bbfft backend (https://github.com/intel/double-batched-fft-library) has a performance advantage over the default oneMKL backend on some Intel GPU platforms. bbfft also has an OpenCL backend, so it would be interesting to see whether it can be built on AMD platforms and whether it provides an advantage there.
VkFFT is also worth trying: https://github.com/DTolm/VkFFT. I made an initial attempt and had trouble getting it to build, so it may require more work, but I have heard reports of good performance, in certain cases exceeding even cuFFT on NVIDIA.
Intel and AMD platforms are more important here initially: cuFFT has been around for a long time and has had more time to mature, so it is harder to beat.