HIP
Calling hipDeviceGetAttribute with the hipDeviceAttributeTotalConstantMemory option returns a negative value.
My code is:

#include <hip/hip_runtime.h>
#include <iostream>

int main() {
    int constant_memory = 0;
    hipDeviceGetAttribute(&constant_memory,
                          hipDeviceAttributeTotalConstantMemory,
                          0);
    std::cout << constant_memory << std::endl;
}
The ROCm version is 4.2. Running this code shows that constant_memory is a negative value: "-16777216". What is the meaning of this number? Is it a normal return value? Thanks.
There is no limitation on the constant buffer on AMD HW; it is equal to the max single allocation, which is about the total memory size. Hence you see an overflow: the runtime can't fit an unsigned 64-bit value into a signed 32-bit int. In general this query was only meaningful for HW more than 12 years old.
Will you change the type of "pi" from int to unsigned 64-bit for hipError_t hipDeviceGetAttribute(int* pi, hipDeviceAttribute_t attr, int deviceId)?
I'm not sure about that, since HIP matches CUDA on the API side and I don't see any reason to add an extra extension. IMO, this particular query is useless on any modern AMD GPU. You could just ignore it.
Okay. Could you please suggest which function should be called to query the total constant memory size, and whether the useless function will be deleted from the HIP APIs? Thanks.
hipDeviceGetLimit() with hipLimitMallocHeapSize. That's the closest thing the app can query right now, because I couldn't find a max-single-allocation query in HIP. The returned value should be the total heap size. In other words, there is no limit.
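A sketch of the suggested query, assuming a HIP-capable device 0 is present. Note that hipDeviceGetLimit writes through a size_t*, so unlike the int* in hipDeviceGetAttribute it does not truncate large values:

```cpp
#include <hip/hip_runtime.h>
#include <iostream>

int main() {
    size_t heap_size = 0;
    hipError_t err = hipDeviceGetLimit(&heap_size, hipLimitMallocHeapSize);
    if (err != hipSuccess) {
        std::cerr << "hipDeviceGetLimit failed: "
                  << hipGetErrorString(err) << std::endl;
        return 1;
    }
    // On AMD HW this should be close to the total device memory size.
    std::cout << "Malloc heap size: " << heap_size << " bytes" << std::endl;
}
```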
I'm not familiar with the function, so I took a look at it. I hope your answer implies that more queries will be added. Thanks.
https://rocm-developer-tools.github.io/HIP/group__Device.html#ga8edc85bb9637d6b1eda0d064d141a255
You can allocate any size; there is no limit except the max single allocation size on the GPU, which is almost the same as the total device memory size. I really don't understand why you need a specific size for constant memory. Only very old HW had a different path for constant buffers and a limitation on their size. There is no "advantage" or limitation for constant buffers now. If you have any issues with an actual allocation, then we could take a look.
It may be helpful to explain the difference in constant buffer size between an Nvidia GPU and an AMD GPU at https://rocmdocs.amd.com/en/latest/ROCm_API_References/HIP_API/Device-management.html
Thanks.
https://github.com/intel/llvm/pull/5168