HIP's __shfl can produce incorrect ISA/results

Open jszuppe opened this issue 7 years ago • 9 comments

We don't know why, but sometimes HIP's shuffle functions can produce incorrect results. We assume that incorrect ISA is produced. If we replace those __shfl* functions with HCC's hc::__shfl*, everything works correctly.
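
For anyone trying to narrow this down, a host-side reference for what a shuffle should return can help spot which lanes come back wrong. The following is only a sketch under assumptions: it follows the semantics documented for CUDA's `__shfl(var, srcLane, width)` (each `width`-sized segment is independent and the source lane wraps modulo `width`), it assumes `width` is a power of two, and `reference_shfl` is a hypothetical helper name, not part of HIP or rocPRIM.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical host-side model of __shfl(var, src_lane, width).
// lane_values[i] is the value of `var` held by lane i of one wavefront.
// Assumes width is a power of two and lane_values.size() is a multiple of width.
std::vector<int> reference_shfl(const std::vector<int>& lane_values,
                                int src_lane, int width)
{
    const std::size_t mask = static_cast<std::size_t>(width - 1);
    std::vector<int> out(lane_values.size());
    for (std::size_t lane = 0; lane < lane_values.size(); ++lane) {
        // First lane of the width-sized segment this lane belongs to.
        const std::size_t segment_base = lane & ~mask;
        // The source lane wraps modulo width inside that segment.
        const std::size_t offset = static_cast<std::size_t>(src_lane) & mask;
        out[lane] = lane_values[segment_base + offset];
    }
    return out;
}
```

Copying the per-lane results of a device kernel back to the host and comparing them element by element against this model would show exactly which lanes differ between `__shfl` and `hc::__shfl`.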

We were not able to produce a minimal test case, but if you remove || defined(__HIP_PLATFORM_HCC__) in rocprim/include/rocprim/intrinsics/warp_shuffle.hpp, then the __shfl* functions are used in the rocPRIM HIP tests and those tests fail.

jszuppe avatar May 04 '18 13:05 jszuppe

Hi @jszuppe I am running into an issue building rocPRIM:

root@29f5659f5f83:~/rocPRIM/build# cmake -DBUILD_BENCHMARK=ON ../.
CMake Error at cmake/Dependencies.cmake:138 (find_package):
  Could not find a package configuration file provided by "ROCM" with any of
  the following names:

    ROCMConfig.cmake
    rocm-config.cmake

  Add the installation prefix of "ROCM" to CMAKE_PREFIX_PATH or set
  "ROCM_DIR" to a directory containing one of the above files.  If "ROCM"
  provides a separate development package or SDK, be sure it has been
  installed.
Call Stack (most recent call first):
  CMakeLists.txt:63 (include)


-- Configuring incomplete, errors occurred!
See also "/root/rocPRIM/build/CMakeFiles/CMakeOutput.log".
See also "/root/rocPRIM/build/CMakeFiles/CMakeError.log".

aaronenyeshi avatar May 24 '18 18:05 aaronenyeshi

Okay, I was able to fix this by installing rocm-cmake, as suggested by SiuChi.

aaronenyeshi avatar May 24 '18 19:05 aaronenyeshi

I've observed that many of the rocPRIM tests fail when the __shfl* functions are used instead of hc::__shfl*. I'll look into this issue.

aaronenyeshi avatar May 24 '18 20:05 aaronenyeshi

I've noticed that the LLVM IR generated in the two cases is different, but it seems that HIP only has wrappers that call hc::__shfl: https://github.com/ROCm-Developer-Tools/HIP/blob/master/src/device_util.cpp#L905

aaronenyeshi avatar May 24 '18 22:05 aaronenyeshi

> I've noticed that the LLVM IR generated in the two cases is different, but it seems that HIP only has wrappers that call hc::__shfl: https://github.com/ROCm-Developer-Tools/HIP/blob/master/src/device_util.cpp#L905

Indeed, we've seen that too. The difference is that in HIP __shfl comes from a library we link against, and in HC hc::__shfl is in a header.

jszuppe avatar May 25 '18 07:05 jszuppe

Hi @jszuppe. It seems that there is some issue with HIP when linking against the library. Moving HIP's __shfl* functions into the header as inline functions fixes this issue. Please see: https://github.com/ROCm-Developer-Tools/HIP/pull/470

aaronenyeshi avatar May 25 '18 19:05 aaronenyeshi

For this issue, please follow: https://github.com/ROCm-Developer-Tools/HIP/pull/515 . Shfl will be added to HIP headers soon.

aaronenyeshi avatar Jun 27 '18 19:06 aaronenyeshi

Any update on that? ROCm-Developer-Tools/HIP#515 has so far been merged only into master. Which ROCm version will include it?

jszuppe avatar Aug 22 '18 09:08 jszuppe

It should be in the ROCm 2.x branch of HIP. @mangupta to clarify.

aaronenyeshi avatar Oct 15 '18 20:10 aaronenyeshi