SYCLomatic
SYCLomatic supports only some of the functions in https://docs.nvidia.com/cuda/cuda-math-api/group__CUDA__MATH__INTRINSIC__SIMD.html. The unsupported functions are listed below. Thanks.
```
DPCT1007:0: Migration of __viaddmax_s16x2 is not supported.
DPCT1007:1: Migration of __viaddmax_s16x2_relu is not supported....
```
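A manual workaround sketch for one of these intrinsics, assuming the packed s16x2 layout described in the CUDA Math API page above; the helper name `viaddmax_s16x2` is my own and the sketch does not handle overflow of the per-lane addition:
```
#include <algorithm>
#include <cstdint>

// Hypothetical helper emulating __viaddmax_s16x2: for each of the two signed
// 16-bit lanes packed in a 32-bit word, compute max(a + b, c).
inline uint32_t viaddmax_s16x2(uint32_t a, uint32_t b, uint32_t c) {
  uint32_t r = 0;
  for (int lane = 0; lane < 2; ++lane) {
    int av = static_cast<int16_t>(a >> (16 * lane));
    int bv = static_cast<int16_t>(b >> (16 * lane));
    int cv = static_cast<int16_t>(c >> (16 * lane));
    int m  = std::max(av + bv, cv);
    r |= (static_cast<uint32_t>(m) & 0xFFFFu) << (16 * lane);
  }
  return r;
}
```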
Consider the following case in a DeepSpeed kernel: a `__global__` function template with a parameter pack.
```
template <typename T, typename U, typename... ArgTypes>
__global__ void multi_tensor_apply_kernel(
    int chunk_size, volatile int* noop_flag, T tl, U callable, ArgTypes... args) {...
```
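One possible way to express such a kernel in SYCL, as a sketch only (this is not SYCLomatic's actual output): the pack is captured by value and forwarded into the `parallel_for` lambda. The wrapper name, the range sizes, the plain `int*` flag, and the extra `nd_item` argument passed to `callable` are all assumptions of this sketch.
```
#include <sycl/sycl.hpp>

template <typename T, typename U, typename... ArgTypes>
void multi_tensor_apply(sycl::queue &q, int chunk_size, int *noop_flag,
                        T tl, U callable, ArgTypes... args) {
  // Forward the parameter pack into the kernel lambda; 320 work-groups of
  // 512 work-items are illustrative values.
  q.parallel_for(sycl::nd_range<1>(sycl::range<1>(320 * 512),
                                   sycl::range<1>(512)),
                 [=](sycl::nd_item<1> item) {
                   callable(chunk_size, noop_flag, tl, item, args...);
                 });
}
```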
Add Migration of CUB Block Exchange API. cc @yihanwg
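For context, a sketch of the CUDA-side pattern this request covers; the block size, items per thread, and kernel name are illustrative:
```
#include <cub/cub.cuh>

// Striped-to-blocked exchange across a thread block using cub::BlockExchange.
__global__ void exchange_kernel(int *d_data) {
  using BlockExchange = cub::BlockExchange<int, 128, 4>;
  __shared__ typename BlockExchange::TempStorage temp_storage;

  int thread_data[4];
  cub::LoadDirectStriped<128>(threadIdx.x, d_data, thread_data);
  BlockExchange(temp_storage).StripedToBlocked(thread_data, thread_data);
  cub::StoreDirectBlocked(threadIdx.x, d_data, thread_data);
}
```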
The bfloat16 class has been non-experimental for a while now, supporting all backends: https://github.com/oneapi-src/SYCLomatic/pull/1286 However, SYCLomatic appears not to be using it, and instead always casts to float,...
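A minimal sketch of using the class directly, assuming DPC++'s `sycl::ext::oneapi::bfloat16` and USM device pointers; the function and parameter names are mine:
```
#include <sycl/sycl.hpp>
#include <cstddef>

using bf16 = sycl::ext::oneapi::bfloat16;

// Scale a bfloat16 array in place, using the extension's operator* instead
// of converting every element to float in source.
void scale(sycl::queue &q, bf16 *data, std::size_t n, float factor) {
  bf16 f{factor};
  q.parallel_for(sycl::range<1>(n), [=](sycl::id<1> i) {
    data[i] = data[i] * f;
  }).wait();
}
```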
Add Migration of cub::store API. Linked to #1819. cc @yihanwg, @zhimingwang36
The prerequisites section of the README for the SYCLomatic repo doesn't call out the CUDA headers dependency or the supported versions (https://github.com/oneapi-src/SYCLomatic#prerequisites). The post-migration SYCL can be targeted for non-Intel...
Migrating a function from CUDA to SYCL with DPCT produces an incomplete result (e.g. the kernel name is missing). Please see the following code snippets from the program (https://github.com/zjin-lcf/HeCBench/blob/master/ssim-cuda/utils.h). Could...
Please see the example https://docs.nvidia.com/deeplearning/nccl/user-guide/docs/examples.html#example-1-single-process-single-thread-multiple-devices
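Condensed from that example, the CUDA/NCCL pattern whose migration is being requested: one process drives several GPUs and issues an allreduce across them inside a group call. The fixed-size communicator array and the lack of error checking are simplifications of this sketch.
```
#include <nccl.h>
#include <cuda_runtime.h>

void allreduce_multi_device(int nDev, float **sendbuff, float **recvbuff,
                            size_t count, cudaStream_t *streams) {
  ncclComm_t comms[8];                    // assumes nDev <= 8
  ncclCommInitAll(comms, nDev, nullptr);  // nullptr: use devices 0..nDev-1

  ncclGroupStart();
  for (int i = 0; i < nDev; ++i)
    ncclAllReduce(sendbuff[i], recvbuff[i], count, ncclFloat, ncclSum,
                  comms[i], streams[i]);
  ncclGroupEnd();

  for (int i = 0; i < nDev; ++i) {
    cudaSetDevice(i);
    cudaStreamSynchronize(streams[i]);
    ncclCommDestroy(comms[i]);
  }
}
```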
**Is your feature request related to a problem? Please describe** SYCLomatic translates memory marked with `__constant__` in CUDA into just a standard read-only SYCL buffer/accessor. This means that...
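For reference, the kind of CUDA pattern in question; the symbol name and sizes are illustrative:
```
#include <cuda_runtime.h>

// Small lookup table in __constant__ memory, filled from the host with
// cudaMemcpyToSymbol and read by every thread.
__constant__ float coeffs[16];

__global__ void apply(const float *in, float *out, int n) {
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n) out[i] = in[i] * coeffs[i % 16];
}

void upload(const float *host_coeffs) {
  cudaMemcpyToSymbol(coeffs, host_coeffs, 16 * sizeof(float));
}
```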