How to dump the generated source kernel?

Open tingxingdong opened this issue 1 year ago • 6 comments

I can see a cubin/AMD binary dumped after running the testSuite Portal, but I do not see the source kernel dumped. Where can I find the source kernel?

tingxingdong avatar Mar 07 '24 07:03 tingxingdong

Enable the keepShaderCode parameter in configuration and the test suite will print out all executed kernels.

DTolm avatar Mar 07 '24 08:03 DTolm

I edited vkFFT/vkFFT/vkFFT_Structs/vkFFT_Structs.h in vim and set

pfUINT keepShaderCode = 1;//will keep shader code and print all executed shaders during the plan execution in order (0 - off, 1 - on)

This causes the testSuite portal to stop working and return almost immediately.

tingxingdong avatar Mar 07 '24 08:03 tingxingdong

I edited benchmark_scripts/vkFFT_scripts/src/user_benchmark_VkFFT.cpp in vim and added configuration.keepShaderCode = 1;

I still do not see the CUDA source kernel dumped under the folder.

Where is it?

tingxingdong avatar Mar 07 '24 09:03 tingxingdong

I mean the CUDA/HIP source kernel, not the VkFFT_binary generated under the folder.

tingxingdong avatar Mar 07 '24 09:03 tingxingdong

```
grep -r -i "keepShaderCode" *
benchmark_scripts/vkFFT_scripts/src/user_benchmark_VkFFT.cpp:                   configuration.keepShaderCode = 1;
benchmark_scripts/vkFFT_scripts/src/sample_14_precision_VkFFT_single_nonPow2.cpp:                       configuration.keepShaderCode = 1;
benchmark_scripts/vkFFT_scripts/src/sample_51_convolution_VkFFT_single_3d_matrix_zeropadding_r2c.cpp:   convolution_configuration.keepShaderCode = 1;
benchmark_scripts/vkFFT_scripts/src/sample_15_precision_VkFFT_single_r2c.cpp:                   configuration.keepShaderCode = 1;
benchmark_scripts/vkFFT_scripts/src/sample_16_precision_VkFFT_single_dct.cpp:                           configuration.keepShaderCode = 1;
```

Yet I still do not see any kernel printed out.

tingxingdong avatar Mar 07 '24 09:03 tingxingdong

You need to modify the configuration struct in the example you are trying to execute, not the struct definition. I suggest opening sample_11_precision_VkFFT_single.cpp for the power-of-2 cases and setting it there.

DTolm avatar Mar 07 '24 09:03 DTolm
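
To illustrate the advice above, here is a minimal, self-contained sketch of setting the flag on the configuration instance that a sample actually passes to the library. It does not use the real VkFFT headers: `DemoConfiguration`, `runDemoPlan`, `dumpedSourceFor`, and the `size` field are hypothetical stand-ins invented for this example. Only the field name `keepShaderCode` and the assignment `configuration.keepShaderCode = 1;` come from the thread.

```cpp
#include <cassert>
#include <cstdint>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for the library's configuration struct.
// Only the keepShaderCode field name is taken from the thread.
struct DemoConfiguration {
    std::uint64_t size = 0;            // hypothetical: transform length
    std::uint64_t keepShaderCode = 0;  // 0 - off, 1 - on
};

// Hypothetical stand-in for plan execution: builds a placeholder "kernel"
// string and writes it to `out` only if this configuration requests it.
std::string runDemoPlan(const DemoConfiguration& configuration, std::ostream& out) {
    assert(configuration.size > 0);
    std::string kernelSource =
        "// generated kernel for size " + std::to_string(configuration.size) + "\n";
    if (configuration.keepShaderCode) {
        out << kernelSource;  // the "dump" happens per configuration instance
    }
    return kernelSource;
}

// Mirrors the advice: the flag is set on the instance used by the sample
// being run, not by editing the struct definition.
std::string dumpedSourceFor(std::uint64_t size, bool keep) {
    DemoConfiguration configuration = {};
    configuration.size = size;
    configuration.keepShaderCode = keep ? 1 : 0;
    std::ostringstream captured;  // stands in for the test suite's output
    runDemoPlan(configuration, captured);
    return captured.str();
}
```

In this sketch, calling `dumpedSourceFor(64, true)` returns the placeholder kernel text, while `dumpedSourceFor(64, false)` returns an empty string. The point is that the flag only takes effect in the sample whose configuration instance sets it, which matches the suggestion to add the assignment in the sample being executed (for example sample_11_precision_VkFFT_single.cpp).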