
How to reduce the initialization time?

Open zhaohaifei opened this issue 2 years ago • 12 comments

The plan initialization time is too long. How can I reduce it? For example, can the compiled kernels be saved? I saw the saveApplicationToString option, which can save the entire plan. Is there any other way?

zhaohaifei avatar Sep 20 '23 07:09 zhaohaifei

Hello,

No, there is no other way. saveApplicationToString only saves the binaries, not the plan, and this is precisely what you are looking for.

Best regards, Dmitrii

DTolm avatar Sep 25 '23 21:09 DTolm

Is it possible to save the internally compiled kernel functions in a directory so that they can be used directly next time without compiling them again? Is such a solution feasible in VkFFT? Is it difficult?

zhaohaifei avatar Oct 09 '23 06:10 zhaohaifei

Hello,

saveApplicationToString saves the binaries, which you can later load with the loadApplicationFromString configuration option. See pages 64-65 of the documentation.

Best regards, Dmitrii

DTolm avatar Oct 09 '23 08:10 DTolm
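The save/load round trip described above can be sketched as a pair of file helpers. This is a minimal sketch, not code from VkFFT: `write_blob`, `read_blob`, and the file name `fft_plan.bin` are hypothetical, and the VkFFT field names in the closing comment (`saveApplicationString`, `applicationStringSize`, `loadApplicationString`) are given as I understand them from the VkFFT documentation and should be checked against the pages cited above.

```c
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>

/* Hypothetical helper: write `size` bytes from `data` to `path`.
   Returns 0 on success, -1 on failure. */
int write_blob(const char *path, const void *data, uint64_t size) {
    FILE *f = fopen(path, "wb");
    if (!f) return -1;
    size_t written = fwrite(data, 1, (size_t)size, f);
    int close_failed = fclose(f);
    return (written == (size_t)size && close_failed == 0) ? 0 : -1;
}

/* Hypothetical helper: read the whole file at `path` into a malloc'ed
   buffer and store its length in *size_out.
   Returns NULL on failure; the caller frees the buffer. */
void *read_blob(const char *path, uint64_t *size_out) {
    FILE *f = fopen(path, "rb");
    if (!f) return NULL;
    if (fseek(f, 0, SEEK_END) != 0) { fclose(f); return NULL; }
    long len = ftell(f);
    if (len < 0) { fclose(f); return NULL; }
    rewind(f);
    void *buf = malloc(len > 0 ? (size_t)len : 1);
    if (!buf) { fclose(f); return NULL; }
    size_t got = fread(buf, 1, (size_t)len, f);
    fclose(f);
    if (got != (size_t)len) { free(buf); return NULL; }
    *size_out = (uint64_t)len;
    return buf;
}

/* Intended use with VkFFT (not compiled here; verify the field names
   against your VkFFT version):

   First run, compile and save:
       configuration.saveApplicationToString = 1;
       initializeVkFFT(&app, configuration);
       write_blob("fft_plan.bin", app.saveApplicationString,
                  app.applicationStringSize);

   Later runs, load instead of compiling:
       uint64_t size = 0;
       void *blob = read_blob("fft_plan.bin", &size);
       if (blob) {
           configuration.loadApplicationFromString = 1;
           configuration.loadApplicationString = blob;
       }
       initializeVkFFT(&app, configuration);
       free(blob);
*/
```

The saved binaries belong to one configuration, so a blob should only be loaded for the same sizes, strides, precision, and device it was generated on.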

saveApplicationToString cannot cover all situations. I want to save the internal kernels, not the application. I would compile all the kernels in advance, so that any subsequent application that requires the same kernel can use it directly without compiling again.

zhaohaifei avatar Oct 09 '23 08:10 zhaohaifei

I am sorry, I don't understand. Which situations can't it adapt to? Please provide an example configuration. From what I read, this functionality is exactly what you want to do.

DTolm avatar Oct 09 '23 09:10 DTolm

I want to implement a function such that any size and any stride can be quickly initialized. My approach is to prepare all the kernels in advance and put them in a directory. When they are used later, they don't need to be compiled again. It is impossible to save all applications because there are too many of them. But the internal kernels are universal and limited in number, so they can be saved in advance.

zhaohaifei avatar Oct 09 '23 09:10 zhaohaifei
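One way to organize a directory of saved binaries is to derive the file name from the configuration itself, so that each distinct size/stride combination is compiled once and reloaded afterwards. The following is a sketch under stated assumptions: `PlanKey`, `fnv1a64`, and `plan_cache_path` are hypothetical names, not part of VkFFT, and a real key would need every parameter that influences the generated code (backend, device, VkFFT version, and so on). It caches whole applications per configuration; it does not avoid the first compilation of each one.

```c
#include <stdio.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical description of the parameters that select a saved binary.
   Extend it with anything else that changes the generated code. */
typedef struct {
    uint64_t size[3];       /* FFT dimensions */
    uint64_t stride[3];     /* buffer strides */
    int double_precision;   /* 0 = single, 1 = double */
    char device[64];        /* device name, NUL-terminated */
} PlanKey;

/* FNV-1a 64-bit hash step over a byte range, continuing from state h. */
uint64_t fnv1a64(uint64_t h, const void *data, size_t len) {
    const unsigned char *p = data;
    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= 0x100000001b3ULL;
    }
    return h;
}

/* Build "<dir>/<16 hex digits>.bin" for a key.
   Fields are hashed one by one so struct padding never enters the hash.
   Returns 0 on success, -1 if `out` is too small. */
int plan_cache_path(const PlanKey *k, const char *dir,
                    char *out, size_t out_len) {
    uint64_t h = 0xcbf29ce484222325ULL; /* FNV-1a offset basis */
    h = fnv1a64(h, k->size, sizeof k->size);
    h = fnv1a64(h, k->stride, sizeof k->stride);
    h = fnv1a64(h, &k->double_precision, sizeof k->double_precision);
    h = fnv1a64(h, k->device, strlen(k->device));
    int n = snprintf(out, out_len, "%s/%016llx.bin",
                     dir, (unsigned long long)h);
    return (n > 0 && (size_t)n < out_len) ? 0 : -1;
}
```

At initialization, the application would look for the file at this path, load it if present, and otherwise compile normally and save the result there.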

The internal kernel is not universal and can't be saved in advance. The thing that you call kernel is a sequence of CPU calls that create the code for a particular FFT and compile it later.

DTolm avatar Oct 09 '23 09:10 DTolm

I extracted the generic kernel and compiled it in advance. Subsequent calls can then use it directly, without compilation at runtime. Can such a feature be achieved by modifying the code? If it is possible, I'll try it.

zhaohaifei avatar Nov 06 '23 08:11 zhaohaifei

No, it is not possible to create an uberkernel that will work for all system configurations - it will require a big redesign of the library and won't work with all the algorithms.

DTolm avatar Nov 06 '23 20:11 DTolm

Just for HIP. And it’s ok if it can cover most of the algorithms. Can such a feature be implemented?

zhaohaifei avatar Nov 07 '23 01:11 zhaohaifei

This feature is not on my development radar, as it would require too much time to implement for no particular benefit. If you want to experiment with it, you are free to do so.

DTolm avatar Nov 07 '23 07:11 DTolm

As long as it can be achieved and takes no more than one month, I will give it a try.

zhaohaifei avatar Nov 07 '23 07:11 zhaohaifei