SYCLomatic
Signed-off-by: chenwei.sun
Signed-off-by: Jiang, Zhiwei
Signed-off-by: Daiyaan Ahmed
Signed-off-by: Tang, Jiajun
In-progress PR for Load/Store header functions for the Block API (related later to #1305). cc @yihanwg @danhoeflinger @mmichel11
There is a performance gap between the CUDA and DPCT programs when they run on an NVIDIA GPU (e.g. 3090). Thanks for your review. DPCT program: https://github.com/zjin-lcf/HeCBench/blob/master/src/streamCreateCopyDestroy-sycl/main.cpp Create+Copy+Synchronize+Destroy time for 1...
Following the email discussion on SYCLcompat for cooperative group APIs, this PR was created to add all block/warp-related header algorithms for SYCL. This PR only replaces the group APIs -...