HeCBench
bitonic-sort: hoist sycl::queue out of timing
bitonic-sort-sycl is a lot slower than the HIP version when both run on chipStar targeting the same device through OpenCL. My guess is that creating the sycl::queue triggers SYCL runtime initialization inside the timed region, whereas the chipStar runtime is (currently) initialized before main() is entered.
This patch puts CUDA, HIP and SYCL on an equal footing by initializing the runtime before the measurements start.