cufftShift
cufftShift_2D_impl() : cufftShift_2D_IP_impl.cu
Hi,
Thanks for providing your FFT Shift implementation!
I am using your software to perform 2D FFT shifts on the NVIDIA TX1. I was originally using the out-of-place version (cufftShift_2D_OP_impl.cu), but I found that it caused the NVIDIA TX1 to hang periodically. As best I could tell, a CPU/GPU synchronization was hanging the GPU.
Having been unsuccessful in finding the root cause, I switched to the in-place FFT shift and found it did not perform the FFT shift correctly. Only two of the quadrants were shifted (as opposed to all four).
As best I can tell, the following line of code:
kernelConf* conf = cufftShift::GenAutoConf_2D(N/2);
needs to be changed to:
kernelConf* conf = cufftShift::GenAutoConf_2D(N);
I came to this conclusion by comparing the in-place version to the out-of-place version (which works correctly) and noticing that the out-of-place version uses N instead of N/2.
So in summary, with this change the in-place version correctly FFT shifts the image and the NVIDIA TX1 GPU does not hang.
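For anyone who wants to check the result independently of the GPU code, here is a minimal host-side sketch of a 2D FFT shift that can serve as a reference. It is not part of the cufftShift library: the function name fftShift2DReference is hypothetical, and it assumes a square N x N row-major array with even N. It swaps each element in the top half with the element half a row and half a column away, so all four quadrants get exchanged.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical host-side reference, not a cufftShift API.
// In-place 2D FFT shift of an N x N row-major array; assumes N is even.
template <typename T>
void fftShift2DReference(std::vector<T>& data, std::size_t N) {
    assert(N % 2 == 0 && data.size() == N * N);
    const std::size_t half = N / 2;

    // Visit every column of the top half only, so each pair is swapped once.
    for (std::size_t y = 0; y < half; ++y) {
        for (std::size_t x = 0; x < N; ++x) {
            // Partner is half a column (cyclic) and half a row away:
            // top-left <-> bottom-right, top-right <-> bottom-left.
            const std::size_t px = (x + half) % N;
            const std::size_t py = y + half;
            std::swap(data[y * N + x], data[py * N + px]);
        }
    }
}
```

Copying the GPU output back to the host and comparing it element by element against this reference should show quickly whether only two quadrants were swapped. Applying the shift twice should also return the original data when N is even.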
Just thought I'd pass on this finding in case others run into a similar issue.
Thanks again for providing this implementation!
Hi, thanks for the interest. It is quite strange that there is a synchronization issue, since I have tested the implementation for the out-of-place kernel and verified the results.
For the in-place version, I might need to check it, since I patched that code for some reason and haven't verified it. I will try it again and correct it.
Thanks for your helpful comments.
The issue may be platform specific. I am using the NVIDIA TX1.
When using the out-of-place version, the GPU hang occurs very, very infrequently. If I make the out-of-place call at 10 Hz, it may hang after 2 minutes or after 2 hours, but it will eventually hang.
When using the in-place version, I've been able to run all weekend.
I'll keep up-to-date on any changes you make to the library.
Thanks again! Your library was very useful!