David Bayer
Hi, so as I understand it, you suggest that the `VkFFTConfiguration` structure could look like:

```cpp
typedef struct {
  // ...
  pfUINT doublePrecision;
  pfUINT quadDoubleDoublePrecision;
  pfUINT quadDoubleDoublePrecisionDoubleMemory;
  ...
```
More changes:
- the CMake version is upgraded to 3.17
- we can use CUDAToolkit instead of the deprecated CUDA package
- unity build can be used to significantly reduce compile...
Hi, I have fixed the casts from different types and added `VKFFT_VENDOR_APPLE` to the enum. I am sorry, I overlooked it completely. Best regards, David
Sure, the mechanism is described here: https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#cross-stream-dependencies-and-events. It is just extended to work with an arbitrary number of streams. David
That would be awesome! However, I would consider naming it performR2R[fftDim] in case any other R2R transforms are added in the future. Thank you very much. David
Great job! Thanks a lot.
I created a preview of this feature for vkFFT_InitializeApp module in: https://github.com/DejvBayer/VkFFT/tree/goto-error-handling I think it shortens the code quite a bit and makes it more focused on what is being...
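For readers unfamiliar with the idiom, here is a minimal sketch of goto-based error handling in C. It is not the code from the linked branch: `DemoApp`, `DemoResult`, `demo_initialize` and `demo_delete` are hypothetical names made up for illustration, and the real VkFFT initialization manages far more resources. The idea is that every failure path jumps to a single label that undoes whatever was set up so far.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical result codes for this sketch (not VkFFT's VkFFTResult). */
typedef enum {
    DEMO_SUCCESS = 0,
    DEMO_ERROR_INVALID_SIZE = 1,
    DEMO_ERROR_MALLOC_FAILED = 2
} DemoResult;

/* Hypothetical application object that owns two heap allocations. */
typedef struct {
    float *buffer;
    int *axes;
    size_t size;
} DemoApp;

/* Releases everything the app owns; safe on a partially initialized app
 * because free(NULL) is a no-op. */
static void demo_delete(DemoApp *app) {
    free(app->axes);
    free(app->buffer);
    app->axes = NULL;
    app->buffer = NULL;
    app->size = 0;
}

/* Initializes the app. Every failure sets the result code and jumps to the
 * single cleanup label instead of repeating the cleanup at each check. */
static DemoResult demo_initialize(DemoApp *app, size_t size) {
    DemoResult res = DEMO_SUCCESS;

    /* Start from a known state so the cleanup path is always valid. */
    app->buffer = NULL;
    app->axes = NULL;
    app->size = 0;

    if (size == 0) {
        res = DEMO_ERROR_INVALID_SIZE;
        goto fail;
    }

    app->buffer = (float *)malloc(size * sizeof(float));
    if (app->buffer == NULL) {
        res = DEMO_ERROR_MALLOC_FAILED;
        goto fail;
    }

    app->axes = (int *)malloc(3 * sizeof(int));
    if (app->axes == NULL) {
        res = DEMO_ERROR_MALLOC_FAILED;
        goto fail;
    }

    app->size = size;
    return DEMO_SUCCESS;

fail:
    demo_delete(app);
    return res;
}
```

A caller checks the returned code and, on success, releases the object with `demo_delete` when done. Declaring and resetting all owned pointers before the first `goto` is what keeps the shared cleanup path safe.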
> is this PR still active? I found some use cases in CUB/libcu++, so it would be great if we are able to merge it

It is, but there are...
I know about it; however, I believe this is a better solution. Thanks for the change!
I believe the problem is that the local type is defined in a `__host__`-only function and is not visible during `__device__` compilation. I've slightly modified the provided example here:...