FFTW.jl
FFTW.jl copied to clipboard
support split complex arrays (separate real/imag arrays)
MethodError: no method matching plan_fft(::StructVector{ComplexF64, NamedTuple{(:re, :im), Tuple{Vector{Float64}, Vector{Float64}}}, Int64}, ::UnitRange{Int64}; flags=0x00000000)
The best approach is for StructArrays
to add an FFTW extension
Yes, this could utilize the FFTW's fftw_plan_guru_split_dft
and fftw_execute_split_dft
interface to transform the split complex format directly.
Although specifically supporting StructArrays.jl is something for an extension, we could help out in FFTW.jl by adding plan_fft_split
and execute_fft_split
functions, that take separate real and imaginary arrays as arguments. This might be useful for other people as well, and will allow StructArrays to avoid low level ccall
s.
I have a use case for this:
One of the common operations in GPS signal processing is signal correlation in frequency domain by doing fft -> multiply by complex conjugate of fft(local signal) -> ifft -> abs2 (for some operations, for example for signal acquisition)
But for LoopVectorization or SIMD in general to work with complex multiplication we need to deinterleave the complex array. Then we need to reinterleave it so we can feed it to ifft.
So what we have now is
interleaved array -> fft -> deinterleave -> complex conjugate SIMD multiply -> interleave -> ifft -> deinterleave -> SIMD abs2
I haven't see FFTW source code myself, but it might do deinterleaving and interleaving inside, so we have so many useless data movement:
interleaved array ->(deinterleave -> fft -> interleave) -> deinterleave -> complex conjugate SIMD multiply -> interleave -> (deinterleave -> ifft -> interleave) -> deinterleave -> SIMD abs2
If we have FFTW split complex interface implemented in FFTW.jl
, we have a cleaner workflow:
interleaved array -> deinterleave -> fft -> complex conjugate SIMD multiply -> ifft -> SIMD abs2
A higher level interface would certainly be useful
But for LoopVectorization or SIMD in general to work with complex multiplication we need to deinterleave the complex array. Then we need to reinterleave it so we can feed it to ifft.
Is the time for complex multiplication significant compared to the time for the FFT?
(Note that the FFT performance may suffer for split complex arrays, IIRC.)