[SYCL][HIP] support for HIP group ballot

Open abagusetty opened this issue 2 years ago • 5 comments

abagusetty avatar Sep 08 '22 15:09 abagusetty

To address #6718

abagusetty avatar Sep 08 '22 15:09 abagusetty

Does this support ballot over a wavefront of size 64? Thanks.

zjin-lcf avatar Sep 13 '22 21:09 zjin-lcf

@zjin-lcf You are right, it needs the size of 64, which is here: https://github.com/intel/llvm/blob/sycl/sycl/include/sycl/ext/oneapi/sub_group_mask.hpp#L28

I am not sure how to properly change the sub-group mask size to 64 for the HIP backend. That is what is blocking this draft from being completed.

abagusetty avatar Sep 13 '22 21:09 abagusetty

There's a SPIR-V call for the max sub-group size, so essentially the wavefront size:

  • https://github.com/intel/llvm/blob/sycl/libclc/amdgcn-amdhsa/libspirv/workitem/get_max_sub_group_size.cl

You may also be able to use the __AMDGCN_WAVEFRONT_SIZE macro.
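As a rough illustration of the macro route, a minimal sketch could select the mask word width at compile time. The names `kMaxSubGroupSize`, `ballot_word_t` and `lane_bit` are hypothetical and not part of the SYCL headers, and the fallback of 32 for non-AMD targets is an assumption made only so the snippet also builds on a host compiler:

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// __AMDGCN_WAVEFRONT_SIZE is only defined by the compiler when targeting AMD GPUs.
// The fallback value of 32 is an assumption for this sketch, not a backend rule.
#if defined(__AMDGCN_WAVEFRONT_SIZE)
constexpr unsigned kMaxSubGroupSize = __AMDGCN_WAVEFRONT_SIZE;
#else
constexpr unsigned kMaxSubGroupSize = 32;
#endif

// Hypothetical alias: use a 64-bit word when a sub-group can have more than 32 lanes.
using ballot_word_t =
    std::conditional_t<(kMaxSubGroupSize > 32), std::uint64_t, std::uint32_t>;

// Every lane must have its own bit in the mask word.
static_assert(sizeof(ballot_word_t) * 8 >= kMaxSubGroupSize,
              "mask word is too narrow for the sub-group size");

// Returns the mask bit that corresponds to a given lane index.
inline ballot_word_t lane_bit(unsigned lane) {
  assert(lane < kMaxSubGroupSize && "lane index out of range");
  return static_cast<ballot_word_t>(ballot_word_t{1} << lane);
}
```

This only covers the storage width; how the real `sub_group_mask` class should expose a wider mask is a separate question.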

npmiller avatar Sep 14 '22 13:09 npmiller

Does this imply that sub_group_mask is unreliable in general on all hardware with more than 32 wide sub groups?

stefanatwork avatar Sep 22 '22 08:09 stefanatwork

Hi all, any updates on this?

nsirgien avatar Oct 04 '22 17:10 nsirgien

Hello, quick update on this. After looking into it a bit further, the implementation in this PR can easily be tweaked to support 64-wide sub-groups. However, I found that the sub-group mask class in the header is backed by a 32-bit integer, so it cannot hold masks for a 64-wide wavefront.

I've opened a separate PR, building on top of this one, that modifies the sub-group mask class in the headers to support 64 threads, see:

  • https://github.com/intel/llvm/pull/7491
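To make the limitation concrete, here is a small host-side sketch (not the code from either PR) that packs one vote per lane into a mask word. `pack_ballot`, `count_votes` and `kWavefrontSize` are hypothetical names used only for illustration; with a 32-bit word, any vote from lanes 32 to 63 is silently lost:

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kWavefrontSize = 64;  // lanes in one wavefront (assumed)

// Pack one predicate per lane into an integer of type Word (bit i = lane i).
// Lanes that do not fit into Word are dropped, which is the failure mode
// a 32-bit backing integer has on a 64-wide wavefront.
template <typename Word>
Word pack_ballot(const std::array<bool, kWavefrontSize>& votes) {
  constexpr std::size_t bits = sizeof(Word) * 8;
  Word mask = 0;
  for (std::size_t lane = 0; lane < kWavefrontSize && lane < bits; ++lane) {
    if (votes[lane]) mask |= Word{1} << lane;
  }
  return mask;
}

// Count the set bits in a mask by clearing the lowest set bit each iteration.
template <typename Word>
unsigned count_votes(Word mask) {
  unsigned n = 0;
  for (; mask != 0; mask &= mask - 1) ++n;
  return n;
}
```

For example, if lanes 3 and 40 vote true, `pack_ballot<std::uint32_t>` keeps only lane 3, while `pack_ballot<std::uint64_t>` keeps both.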

npmiller avatar Nov 22 '22 17:11 npmiller

Closing this since it is done in #7491.

abagusetty avatar Jan 23 '23 22:01 abagusetty