
Correctly set the index value for __shfl_up.

Open · jchlanda opened this issue 3 years ago · 1 comment

Please see https://www.khronos.org/registry/OpenCL/extensions/intel/cl_intel_subgroups.html for the details of the shuffles.

This was uncovered while writing libclc's Intel subgroup shuffles (https://github.com/intel/llvm/pull/4664/files), which use the same bpermute built-in and were failing tests from llvm-test-suite (among others: https://github.com/intel/llvm-test-suite/blob/intel/SYCL/SubGroup/shuffle.hpp#L88).

jchlanda · Oct 06 '21 10:10

@jchlanda, (self & ~(width-1)) is the lowest lane in the group of width lanes that includes self. If index, the source lane, is below that value, then the shuffle up operation for lane self is a no-op. I do not believe the proposed patch is correct.

Suppose width = 4, self = 2, and lane_delta = 5. Then (self & ~(width-1)) = 0. The first assignment to index yields self - lane_delta = -3, which is below that lowest lane. The proposed patch incorrectly keeps index at -3, whereas the current code replaces index with self = 2, which is correct.
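The index selection described here can be modelled on the host as a rough sketch. The function name shfl_up_source_lane and the plain int parameters are assumptions made for illustration. This is not the hipamd source, and it leaves out the bpermute call that actually moves the data.

```cpp
#include <cassert>

// Hypothetical host-side model of how a shuffle up picks its source lane.
// self:       lane id of the calling lane
// lane_delta: how many lanes "up" the value is shifted
// width:      size of the logical group, assumed to be a power of two
int shfl_up_source_lane(int self, int lane_delta, int width) {
    assert(width > 0 && (width & (width - 1)) == 0);

    // Lowest lane in the group of `width` lanes that contains `self`.
    int group_base = self & ~(width - 1);

    // Candidate source lane; may fall below the group, or below zero.
    int index = self - lane_delta;

    // If the source lies below the group, the shuffle is a no-op for
    // this lane, so the lane reads its own value.
    return (index < group_base) ? self : index;
}
```

With width = 4, self = 2, and lane_delta = 5, this model returns 2, so the lane keeps its own value instead of reading from lane -3.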

b-sumner · Feb 07 '22 20:02