XRT
Large overhead when launching a kernel
I am using the U55C board with Vitis 2022.1, and I have found that the kernel launch latency is very large (50 us to 100 us). This can make a big difference when using the API clGetEventProfilingInfo() to measure kernel execution time.
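One way to see where the extra time accumulates is to split a kernel event into its phases. The sketch below assumes the command queue was created with CL_QUEUE_PROFILING_ENABLE and that the four timestamps were read with clGetEventProfilingInfo() using CL_PROFILING_COMMAND_QUEUED, CL_PROFILING_COMMAND_SUBMIT, CL_PROFILING_COMMAND_START and CL_PROFILING_COMMAND_END (all reported in nanoseconds). The names `EventTimestamps`, `EventBreakdown` and `break_down` are hypothetical helpers for illustration, not part of OpenCL or XRT, and how XRT populates each timestamp is not something this post establishes.

```cpp
#include <cassert>
#include <cstdint>

// Timestamps in nanoseconds for one kernel event, as they would be read from
// clGetEventProfilingInfo() with CL_PROFILING_COMMAND_QUEUED / _SUBMIT /
// _START / _END. Hypothetical container, not an OpenCL or XRT type.
struct EventTimestamps {
    std::uint64_t queued_ns;
    std::uint64_t submit_ns;
    std::uint64_t start_ns;
    std::uint64_t end_ns;
};

// Duration of each phase in microseconds.
struct EventBreakdown {
    double queue_to_submit_us;  // time spent in the host-side queue
    double submit_to_start_us;  // time between submission and execution start
    double start_to_end_us;     // time reported as execution
    double total_us;            // queued to end
};

// Hypothetical helper: converts the four raw timestamps into per-phase durations.
inline EventBreakdown break_down(const EventTimestamps& t) {
    // The profiling timestamps are expected to be monotonically ordered.
    assert(t.queued_ns <= t.submit_ns);
    assert(t.submit_ns <= t.start_ns);
    assert(t.start_ns <= t.end_ns);

    auto to_us = [](std::uint64_t from, std::uint64_t to) {
        return static_cast<double>(to - from) / 1000.0;
    };
    return EventBreakdown{
        to_us(t.queued_ns, t.submit_ns),
        to_us(t.submit_ns, t.start_ns),
        to_us(t.start_ns, t.end_ns),
        to_us(t.queued_ns, t.end_ns),
    };
}
```

Comparing `start_to_end_us` with the run time seen by other means would show whether the launch overhead is already included in the START to END interval or sits in the earlier phases.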
For example, when I measured a kernel's execution time with clGetEventProfilingInfo(), I got 762 us, as follows:
But the kernel actually runs for only 649 us, as follows:
These two timings give bandwidth (BW) estimates of 350 GB/s and 410 GB/s, respectively.
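The arithmetic behind the two estimates can be sketched as follows. The transfer size is not stated in this post; roughly 266 MB is an assumption back-calculated from the figures above (350 GB/s × 762 us and 410 GB/s × 649 us both come to about 266 MB). The function names are hypothetical and only illustrate the calculation.

```cpp
#include <cassert>

// Effective bandwidth in GB/s (1 GB = 1e9 bytes) for `bytes` moved in
// `micros` microseconds. Hypothetical helper for illustration.
inline double bandwidth_gbps(double bytes, double micros) {
    assert(micros > 0.0);
    const double seconds = micros * 1e-6;
    return bytes / seconds / 1e9;
}

// Difference between the time reported by profiling and the actual run time.
inline double launch_overhead_us(double measured_us, double actual_us) {
    return measured_us - actual_us;
}

// ASSUMPTION: about 266 MB moved per kernel run, inferred from the numbers
// in the post rather than stated by it.
constexpr double kAssumedBytes = 266.0e6;

// bandwidth_gbps(kAssumedBytes, 762.0) is about 349 GB/s (profiling API time)
// bandwidth_gbps(kAssumedBytes, 649.0) is about 410 GB/s (actual run time)
// launch_overhead_us(762.0, 649.0) is 113 us
```

With these numbers, the 113 us gap is about 15% of the time reported by the profiling API, which is where the difference between the two bandwidth estimates comes from.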
Is there any way to minimize the launch overhead for kernels?