Results 132 SYCLomatic issues

SYCLomatic supports only some of the functions in https://docs.nvidia.com/cuda/cuda-math-api/group__CUDA__MATH__INTRINSIC__SIMD.html. The unsupported functions are listed below. Thanks. ``` DPCT1007:0: Migration of __viaddmax_s16x2 is not supported. DPCT1007:1: Migration of __viaddmax_s16x2_relu is not supported....

enhancement

Consider the following case in a DeepSpeed kernel, a global function template with a parameter pack: ``` template <typename T, typename U, typename... ArgTypes> __global__ void multi_tensor_apply_kernel( int chunk_size, volatile int* noop_flag, T tl, U callable, ArgTypes... args) {...

enhancement

Add Migration of CUB Block Exchange API. cc @yihanwg

The bfloat16 class has been non-experimental for a while now, supporting all backends: https://github.com/oneapi-src/SYCLomatic/pull/1286 However, SYCLomatic appears not to be using this and instead always casts to float,...

enhancement

Add migration of the cub::store API. Linked to #1819. cc @yihanwg, @zhimingwang36

The prerequisites section of the README for the SYCLomatic repo doesn't call out the CUDA headers dependency or the supported versions (https://github.com/oneapi-src/SYCLomatic#prerequisites). The post-migration SYCL can be targeted for non-Intel...

bug

Migrating a function from CUDA with DPCT produces an incomplete result (e.g. the kernel name is missing). Please see the following code snippets from the program (https://github.com/zjin-lcf/HeCBench/blob/master/ssim-cuda/utils.h). Could...

bug

Please see the example https://docs.nvidia.com/deeplearning/nccl/user-guide/docs/examples.html#example-1-single-process-single-thread-multiple-devices

enhancement

Please see the example https://docs.nvidia.com/deeplearning/nccl/user-guide/docs/examples.html#example-1-single-process-single-thread-multiple-devices Thanks.

enhancement

**Is your feature request related to a problem? Please describe** SYCLomatic translates memory marked with `__constant__` in CUDA into just a standard SYCL buffer/accessor marked read-only. This means that...

enhancement