HIP
Calling hipDeviceGetAttribute with the hipDeviceAttributeTotalConstantMemory option returns a negative value.
My code is:

#include <hip/hip_runtime.h>
#include <iostream>

int main() {
    int constant_memory = 0;
    hipDeviceGetAttribute(&constant_memory,
                          hipDeviceAttributeTotalConstantMemory,
                          0);
    std::cout << constant_memory << std::endl;
}
The ROCm version is 4.2. Running this code shows that constant_memory is a negative value: "-16777216". What is the meaning of this number? Is it a normal return value? Thanks.
There is no limitation on the constant buffer on AMD HW; it is equal to the max single allocation, which is about the total memory size. Hence you see an overflow: the runtime can't fit an unsigned 64-bit value into a signed 32-bit int. In general this query was only meaningful for HW more than 12 years old.
Will you change the type of "pi" from int to unsigned 64-bit for hipError_t hipDeviceGetAttribute(int* pi, hipDeviceAttribute_t attr, int deviceId)?
I'm not sure about that, since HIP matches CUDA on the API side and I don't see any reason to add an extra extension. IMO, this particular query is useless on any modern AMD GPU. You could just ignore it.
Okay. Could you please suggest which function should be called to query the total constant memory size, and whether the useless function will be deleted from the HIP APIs? Thanks.
hipDeviceGetLimit() with hipLimitMallocHeapSize. That's the closest thing the app can query right now, because I couldn't find a max-single-allocation query in HIP. The returned value should be the total heap size. In other words, there is no limit.
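A sketch of the suggested query, assuming a HIP-capable device 0 is present. Note that hipDeviceGetLimit writes through a size_t*, so unlike the int* in hipDeviceGetAttribute it does not truncate large values:

```cpp
#include <hip/hip_runtime.h>
#include <iostream>

int main() {
    size_t heap_size = 0;
    hipError_t err = hipDeviceGetLimit(&heap_size, hipLimitMallocHeapSize);
    if (err != hipSuccess) {
        std::cerr << "hipDeviceGetLimit failed: "
                  << hipGetErrorString(err) << std::endl;
        return 1;
    }
    // On AMD HW this should be close to the total device memory size.
    std::cout << "Malloc heap size: " << heap_size << " bytes" << std::endl;
}
```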
I'm not familiar with the function, so I took a look at it. I hope your answer implies that more queries will be added. Thanks.
https://rocm-developer-tools.github.io/HIP/group__Device.html#ga8edc85bb9637d6b1eda0d064d141a255
You can allocate any size; there is no limit except the max single allocation size on the GPU, which is almost the same as the total device memory size. I really don't understand why you need a specific size for constant memory. Only very old HW had a different path for constant buffers and a limitation on their size. There is no "advantage" or limitation for constant buffers now. If you have any issues with an actual allocation, then we could take a look.
It may be helpful to explain the difference in constant buffer size between an Nvidia GPU and an AMD GPU at https://rocmdocs.amd.com/en/latest/ROCm_API_References/HIP_API/Device-management.html
Thanks.
https://github.com/intel/llvm/pull/5168