cufft_examples
cuFFT and cuFFTDx examples
Getting Started
These examples utilize the following toolsets:
- cuFFT
- cuFFTDx (requires joining the CUDA Math Library Early Access Program: https://developer.nvidia.com/CUDAMathLibraryEA)
- C++11
- CUB 1.13+
Hardware
Volta+
cuFFT_vs_cuFFTDx
This code runs three scenarios:
- cuFFT using cudaMalloc
- cuFFT using cudaMallocManaged
- cuFFTDx using cudaMalloc
Objectives
- Compare cuFFT coding styles when using cudaMalloc versus cudaMallocManaged
- Compare cuFFT performance when using cudaMalloc versus cudaMallocManaged
- Compare performance and results between cuFFT and cuFFTDx
Execution
For float
mkdir build
cd build
cmake -DCMAKE_CUDA_ARCHITECTURES=75 -DCUB_DIR=${HOME}/workStuff/git_examples/cub -DCUFFTDX_DIR=${HOME}/workStuff/cufft/libcufftdx/include ..
make -j
If you don't pass -DCMAKE_CUDA_ARCHITECTURES=XX, the examples are built for compute capabilities 6.0, 7.0, 7.5, and 8.0 (CC60, CC70, CC75, and CC80).
Output
$ D2Z_Z2D/D2Z_Z2D
cufftExecD2Z/Z2D - FFT/IFFT - Managed 29.65 ms
cufftExecD2Z/Z2D - FFT/IFFT - Managed 20.99 ms
cufftExecC2C - FFT/IFFT - Dx 23.31 ms
Compare results [Malloc/Managed]
All values match!
Compare results [Malloc/Dx]
All values match!
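The "Compare results" step above can be pictured as an element-wise check between two output buffers after they have been copied back to the host. The following is a minimal host-only sketch, not the repository's actual implementation: the helper name `count_mismatches`, the relative tolerance of 1e-6, and the use of `std::complex<double>` are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical helper (not taken from the repository): counts elements of
// `candidate` that differ from `reference` by more than a relative tolerance.
// A size mismatch is reported as "every element of the larger buffer differs".
std::size_t count_mismatches(const std::vector<std::complex<double>> &reference,
                             const std::vector<std::complex<double>> &candidate,
                             double tolerance = 1e-6) {  // assumed tolerance
    if (reference.size() != candidate.size()) {
        return std::max(reference.size(), candidate.size());
    }

    std::size_t mismatches = 0;
    for (std::size_t i = 0; i < reference.size(); ++i) {
        const double diff = std::abs(reference[i] - candidate[i]);
        // Scale by the reference magnitude, but never below 1.0, so values
        // near zero are compared with an absolute tolerance instead.
        const double scale = std::max(std::abs(reference[i]), 1.0);
        // Written as !(a <= b) so that NaN values count as mismatches.
        if (!(diff <= tolerance * scale)) {
            ++mismatches;
        }
    }
    return mismatches;
}
```

Under these assumptions, a caller would print "All values match!" when `count_mismatches` returns 0 and report the mismatch count otherwise. A tolerance is used rather than exact equality because different FFT implementations can order floating-point operations differently and so may not agree bit for bit.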
Notes
- This code uses cuFFT callbacks: https://devblogs.nvidia.com/cuda-pro-tip-use-cufft-callbacks-custom-data-processing/
- This code uses separate compilation and linking: https://devblogs.nvidia.com/separate-compilation-linking-cuda-device-code/