Efficient memory use in FFTs

Open sfogerty opened this issue 3 years ago • 3 comments

Simple improvements to memory use in the Cabana FFT implementation could have an outsize impact on performance.

  1. Get rid of the data copies. Instead of converting between various types of complex data, we could use Kokkos::complex<Scalar> directly. The obstacle was the alignment of Kokkos::complex, but that could be resolved with -DKOKKOS_ENABLE_COMPLEX_ALIGN=ON.

  2. heFFTe may be allocating a work buffer for each FFT. If so, we should pass it a work buffer to reuse, which could speed things up significantly.
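
To make both points concrete, here is a minimal sketch in standard C++ only, with no Kokkos or heFFTe dependency. `fake_transform`, `Workspace`, and `transform_no_copy` are hypothetical names invented for illustration and are not part of either library. The sketch assumes the FFT backend accepts interleaved (re, im) scalars plus a caller-provided scratch pointer; whether and how heFFTe exposes that (I believe through a workspace size query and `forward`/`backward` overloads that take a workspace pointer) should be checked against its documentation. It uses `std::complex`, whose layout compatibility with `Scalar[2]` is guaranteed by the standard; the same guarantee for `Kokkos::complex` is an assumption tied to the alignment option mentioned above.

```cpp
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an FFT library call; this is NOT the heFFTe API.
// It operates in place on n interleaved (re, im) pairs and needs 2*n doubles
// of scratch space. It only reverses the complex sequence, so the data flow
// is easy to follow.
void fake_transform(double* data, std::size_t n, double* work)
{
    for (std::size_t i = 0; i < 2 * n; ++i)
        work[i] = data[i];
    for (std::size_t i = 0; i < n; ++i)
    {
        data[2 * i]     = work[2 * (n - 1 - i)];
        data[2 * i + 1] = work[2 * (n - 1 - i) + 1];
    }
}

// Scratch memory that is allocated once and reused across transforms.
// The counter records how often the buffer actually had to grow.
struct Workspace
{
    std::vector<double> buffer;
    int allocations = 0;

    double* require(std::size_t count)
    {
        if (buffer.size() < count)
        {
            buffer.resize(count);
            ++allocations;
        }
        return buffer.data();
    }
};

// Point 1: no conversion copy. An array of std::complex<double> may be
// accessed as interleaved doubles through reinterpret_cast.
// Point 2: the scratch buffer comes from the caller, not from the transform.
void transform_no_copy(std::vector<std::complex<double>>& field, Workspace& ws)
{
    double* raw = reinterpret_cast<double*>(field.data());
    fake_transform(raw, field.size(), ws.require(2 * field.size()));
}
```

Calling `transform_no_copy` repeatedly on fields of the same size grows the workspace only on the first call, which is the behavior we would want from the real FFT wrapper.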

sfogerty avatar Nov 18 '21 20:11 sfogerty

@streeve @sslattery I made an issue here to try capturing potential improvements for FFT performance.

sfogerty avatar Nov 18 '21 20:11 sfogerty

With #451 merged, can you start on these @sfogerty? Ideally, for each performance update you can run the test on each backend and show the improvement. I have a small Python script to compare the results if that's useful.

streeve avatar Dec 01 '21 19:12 streeve

@sfogerty is it relatively straightforward to add the other optimization here?

streeve avatar Jun 03 '22 13:06 streeve