SYCLomatic Codes not migrated

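Please see the migration result below (dpct version 17.0.0, codebase 426a0560b12914f001e8980152d9447a28ed7a4b). The device functions lose their `__device__` qualifier, but the calls to `__shfl_sync` and `__shfl` are left unmigrated.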

Thanks.

input file: test.cuh
#include <cuda.h>

template <typename T>
inline __device__ T shfl(const T val, int srcLane, int width = 32) {
#if CUDA_VERSION >= 9000
    return __shfl_sync(0xffffffff, val, srcLane, width);
#else
    return __shfl(val, srcLane, width);
#endif
}

template <typename T>
inline __device__ T* shfl(T* const val, int srcLane, int width = 32) {
    static_assert(sizeof(T*) == sizeof(long long), "pointer size");
    long long v = (long long)val;

    return (T*)shfl(v, srcLane, width);
}

output:
#include <sycl/sycl.hpp>
#include <dpct/dpct.hpp>

template <typename T>
inline T shfl(const T val, int srcLane, int width = 32) {
#if CUDA_VERSION >= 9000
    return __shfl_sync(0xffffffff, val, srcLane, width);
#else
    return __shfl(val, srcLane, width);
#endif
}

template <typename T>
inline T* shfl(T* const val, int srcLane, int width = 32) {
    static_assert(sizeof(T*) == sizeof(long long), "pointer size");
    long long v = (long long)val;

    return (T*)shfl(v, srcLane, width);
}

Jul 06 '23 16:07 jinz2014

I added a CUDA example (https://github.com/zjin-lcf/HeCBench/tree/master/warpselect-cuda) for your reference. I then manually changed the generated DPCT code for the shuffle functions and for the copy constructor of the Tensor class. The remaining SYCL compiler errors are:

static assertion failed due to requirement 'is_device_copyable<const faiss::gpu::Tensor<float, 2, true, int, faiss::gpu::traits::DefaultPtrTraits>, void>::value || detail::IsDeprecatedDeviceCopyable<const faiss::gpu::Tensor<float, 2, true, int, faiss::gpu::traits::DefaultPtrTraits>, void>::value': The specified type is not device copyable
  static_assert(is_device_copyable<FieldT>::value ||
                ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The tensor class is defined in Tensor.cuh (Tensor.dp.hpp) and Tensor-inl.cuh (Tensor.in.dp.hpp). My understanding is that a Tensor object is not copyable in SYCL while it is copyable in the CUDA program. Thank you for your suggestions.

Jul 07 '23 22:07 jinz2014

@jinz2014
#1 The first reproducer, test.cuh, has been reproduced. The root cause is that the template function shfl is never instantiated, so the calls to __shfl_sync/__shfl are UnresolvedLookupExpr nodes and are therefore not processed by SYCLomatic.

The workaround is to add a caller function that instantiates the template function shfl.
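
The root cause described above is general C++ behavior rather than something CUDA-specific: inside a template, a call whose argument depends on T cannot be resolved until the template is instantiated with a concrete type. The following is a minimal host-only sketch of that mechanism, not CUDA code and not SYCLomatic output. The names lane_read, shfl_like, call_with_int and call_with_long_long are hypothetical; lane_read merely stands in for an overloaded builtin such as __shfl_sync.

```cpp
// Hypothetical host-only stand-ins for an overloaded builtin such as
// __shfl_sync. Each overload returns a tag so we can see which one was chosen.
constexpr int lane_read(int) { return 1; }        // tag 1: int overload
constexpr int lane_read(long long) { return 2; }  // tag 2: long long overload

// The call lane_read(val) depends on T, so the compiler cannot choose an
// overload while shfl_like is still an uninstantiated template.
template <typename T>
constexpr int shfl_like(const T val) {
    return lane_read(val);
}

// Non-template callers: each one instantiates shfl_like for a concrete T,
// which is the point where the dependent call is actually resolved.
constexpr int call_with_int() { return shfl_like(1); }
constexpr int call_with_long_long() { return shfl_like(1LL); }
```

In the CUDA reproducer the analogous step would be a small caller, for example a kernel, that calls shfl with a concrete type before the file is migrated again.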

#2 For the second reproducer (https://github.com/zjin-lcf/HeCBench/tree/master/warpselect-cuda), the code at https://github.com/zjin-lcf/HeCBench/blob/8514c8b89a547ceb7b55eb54888dc5325c6078d8/warpselect-cuda/Tensor.cuh#L95C5-L95C11, https://github.com/zjin-lcf/HeCBench/blob/8514c8b89a547ceb7b55eb54888dc5325c6078d8/warpselect-cuda/Tensor.cuh#L103, and https://github.com/zjin-lcf/HeCBench/blob/8514c8b89a547ceb7b55eb54888dc5325c6078d8/warpselect-cuda/Tensor.cuh#L107 breaks the C++ trivially copyable requirements (https://en.cppreference.com/w/cpp/types/is_trivially_copyable), and the SYCL spec does not allow such a type to be passed as a kernel argument.

The workaround for you is to change the original CUDA code so that Tensor is trivially copyable, then migrate it again.
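
As a rough illustration of the trivially copyable requirement, here is a minimal C++ sketch. It assumes the linked Tensor.cuh lines are user-provided copy/move operations, which matches the earlier mention of the Tensor copy constructor but is not verified here. TensorUserCopy and TensorDefaultCopy are hypothetical, simplified stand-ins and not the faiss class.

```cpp
#include <type_traits>

// Hypothetical simplified tensor view with a hand-written copy constructor.
template <typename T, int Dim>
struct TensorUserCopy {
    T* data;
    int size[Dim];

    TensorUserCopy() : data(nullptr), size{} {}

    // User-provided copy constructor: even though it only copies members,
    // it makes the type non-trivially-copyable.
    TensorUserCopy(const TensorUserCopy& t) : data(t.data) {
        for (int i = 0; i < Dim; ++i) size[i] = t.size[i];
    }
};

// Same data layout, but the copy operations are defaulted, so the compiler
// generates trivial ones and the type stays trivially copyable.
template <typename T, int Dim>
struct TensorDefaultCopy {
    T* data = nullptr;
    int size[Dim] = {};

    TensorDefaultCopy() = default;
    TensorDefaultCopy(const TensorDefaultCopy&) = default;
    TensorDefaultCopy& operator=(const TensorDefaultCopy&) = default;
};

static_assert(!std::is_trivially_copyable<TensorUserCopy<float, 2>>::value,
              "user-provided copy constructor breaks trivial copyability");
static_assert(std::is_trivially_copyable<TensorDefaultCopy<float, 2>>::value,
              "defaulted copy operations keep the type trivially copyable");
```

Whether defaulting the copy operations is acceptable for the real Tensor class depends on what its hand-written versions do beyond member-wise copying, so this only shows the direction of the change.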

Jul 11 '23 06:07 tomflinda

Have you solved it? I have the same problem

Jan 24 '24 10:01 lwqq00

@lwqq00 I reopened the issue.

Mar 03 '24 15:03 jinz2014

@lwqq00 I would try to change the cuda code

Mar 03 '24 17:03 jinz2014

@tomflinda

Users may or may not be able to change the CUDA code. Could you instead suggest the code changes needed in the SYCL program to work around the trivially copyable issue? I assume that the developers of the Intel extension for PyTorch have encountered the same issue before. Thanks.

Mar 05 '24 15:03 jinz2014
