Request: implement hipOccupancyMaxPotentialBlockSize for AMD GPUs

Open kingcrimsontianyu opened this issue 6 years ago • 10 comments

The occupancy calculator API is an invaluable asset in CUDA. Unfortunately, hipOccupancyMaxPotentialBlockSize is currently available only on NVIDIA GPUs. It would be immensely helpful if it were implemented for AMD GPUs as well.

kingcrimsontianyu avatar Feb 21 '19 17:02 kingcrimsontianyu

I second this request.

I created an internal ticket, SWDEV-180694, to track it. It would be highly desirable to have this API implemented so that machine learning frameworks can schedule available GPU resources efficiently.

whchung avatar Feb 21 '19 17:02 whchung

Relevant code in TensorFlow:

https://github.com/ROCmSoftwarePlatform/tensorflow-upstream/blob/develop-upstream/tensorflow/core/util/gpu_launch_config.h#L165

Without this function implemented in HIP, the grid / block size selection on AMD hardware will always be sub-optimal.

whchung avatar Mar 15 '19 16:03 whchung

I think this can be closed as of ROCm 2.7?
https://github.com/ROCm-Developer-Tools/HIP/blob/854768787ee9bbd6ed22b3e8fd0f139955a57e6a/src/hip_module.cpp#L1015

jeffdaily avatar Sep 20 '19 15:09 jeffdaily

The HIP implementation is not comparable to the corresponding CUDA function, which takes a callback function so that the dynamic shared memory size can be computed from the block size.
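
To make the difference concrete, here is a minimal host-side sketch of a block-size search in which the dynamic shared memory is recomputed for every candidate block size. It is not the HIP or CUDA implementation. All names in it (`LaunchConfig`, `SmemForBlock`, `BlocksPerCU`, `max_potential_block_size_variable_smem`) are hypothetical, and the occupancy query is passed in as a plain callable. In real code that callable might wrap a per-block-size query such as hipOccupancyMaxActiveBlocksPerMultiprocessor, assuming it is available and behaves correctly in your ROCm version.

```cpp
#include <cstddef>
#include <functional>

// Result of the search: chosen block size and the grid size that
// fills every compute unit at that block size.
struct LaunchConfig {
    int min_grid_size;
    int block_size;
};

// Hypothetical callback types (not part of HIP or CUDA):
//   SmemForBlock: block size -> dynamic shared memory bytes per block
//   BlocksPerCU:  (block size, smem bytes) -> resident blocks per compute unit
using SmemForBlock = std::function<std::size_t(int)>;
using BlocksPerCU = std::function<int(int, std::size_t)>;

// Try block sizes from large to small in steps of `granularity`
// (typically the wavefront/warp size) and keep the one that yields
// the most resident threads per compute unit. Ties keep the larger
// block size because only a strictly better candidate replaces it.
LaunchConfig max_potential_block_size_variable_smem(
    const BlocksPerCU& blocks_per_cu,
    const SmemForBlock& smem_for_block,
    int num_compute_units,
    int max_block_size,
    int granularity)
{
    LaunchConfig best{0, 0};
    if (granularity <= 0 || max_block_size < granularity) {
        return best;  // nothing to search
    }

    int best_threads = 0;
    const int start = (max_block_size / granularity) * granularity;
    for (int bs = start; bs > 0; bs -= granularity) {
        // The key point: shared memory depends on the candidate block size.
        const std::size_t smem = smem_for_block(bs);
        const int blocks = blocks_per_cu(bs, smem);
        const int threads = blocks * bs;
        if (threads > best_threads) {
            best_threads = threads;
            best = LaunchConfig{blocks * num_compute_units, bs};
        }
    }
    return best;
}
```

With a fixed shared memory argument, the search cannot account for kernels whose shared memory grows with the block size, so it may pick a block size whose real footprint reduces occupancy. Passing a callable avoids that at the cost of one occupancy query per candidate block size.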

Cc: @nbeams

jedbrown avatar Feb 13 '22 21:02 jedbrown

I would further clarify that we would like a HIP version for the driver API function cuOccupancyMaxPotentialBlockSize, which I believe corresponds to cudaOccupancyMaxPotentialBlockSizeVariableSMem in the runtime API.

nbeams avatar Feb 14 '22 16:02 nbeams

I see this was left as a TODO in https://github.com/ROCm-Developer-Tools/HIP/pull/1943/files#diff-9ec4991aeca8528b60eaf6d00b089eecda171d49742e348561c957c5fa2000feR1328-R1342

@gargrahul Can you suggest a workaround?

jedbrown avatar Feb 14 '22 18:02 jedbrown

Hello, is this still being worked on? It has been two years since the last update here, and unless I am making a serious user error, it is still not working (it somehow breaks calls that occur before I even call it).

0x0015 avatar Apr 21 '24 20:04 0x0015