HIP-CPU

An implementation of HIP that works on CPUs, across OSes.

Results 24 HIP-CPU issues

Consider the following HIP program: ``` #include #include __global__ void my_kernel(int * data_in, int * data_out) { int idx = blockIdx.x * blockDim.x + threadIdx.x; data_out[idx] = __shfl_down(data_in[idx], 16); }...

Hello! We are having problems compiling HIP-CPU, and I am wondering if someone could help us out. Maybe we are missing something... Trying to compile on a machine with Intel...

According to the HIP programming guides, [Warp Cross Lane Functions](https://rocmdocs.amd.com/en/latest/Programming_Guides/HIP-GUIDE.html#warp-cross-lane-functions) are well supported in HIP. But I couldn't build HIP code that uses some of these warp functions, e.g. `int __all(int...

Hi all, has anyone tried this target to run TensorFlow etc. on HIP-CPU?

I am having difficulty understanding how the processor_ function interacts with the co_thread and how threadIdx dimensions are used. Is there a thread pool created? Where is the processor_ function...

Currently, a lot of headers include \ to check for availability of parallel algorithms. However, this header is not available in C++17, but only from C++20 onwards.

Trying to compile the HIP program ``` #include __global__ void my_kernel() { extern __shared__ int dyn_shmem[]; } int main() { int dyn_shmem_size = 64; hipLaunchKernelGGL(my_kernel, 4, 32, dyn_shmem_size, 0); hipDeviceSynchronize();...

The values of warpSize read from the hipDeviceProps_t variable and the kernel builtin variable warpSize are different, which is very unexpected. Consider the following HIP program: ``` #include #include __global__...

The following example line of code ``` hipMalloc(&d_x, count * sizeof(float)); ``` fails to compile (using g++ 9.4.0) with error ``` saxpy.hip.cpp:46:15: error: invalid conversion from ‘float**’ to ‘void**’ [-fpermissive]...

Since HIP-CPU uses the libco fiber mechanism to mimic GPU thread behaviour, the stack is constantly rewritten, which leads ASan to report false positives all over the place. Seems like...