clvk

Correct values for device queries that are not available in Vulkan

Open jrprice opened this issue 4 years ago • 2 comments

Several clGetDeviceInfo queries such as CL_DEVICE_GLOBAL_MEM_CACHE_SIZE and CL_DEVICE_MAX_COMPUTE_UNITS currently have placeholder values such as 0 or 1, since there is no way to query these device properties from Vulkan (AFAIK).

Some OpenCL applications use these (and other) queries for heuristics, which can lead to clvk performing very poorly in certain situations. For example, MACE uses the cache size and the number of compute units to select a work-group size for some of its kernels. With the current placeholder values, this results in suboptimal work-group sizes such as (1,2,1), and an order-of-magnitude performance degradation compared with what we get when the correct values are reported.

@kpet Do you have any thoughts about how to get clvk to report better values for these queries? The simplest approach would be to hardcode the right values for any known devices (returning the placeholder values and emitting a warning for unknown devices). A dynamic solution would of course be better (something like hwloc?) but I don't know if there is anything that would work on all the platforms and devices that we care about.
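To make the hardcoded-table idea concrete, here is a minimal sketch of what I have in mind. It is not existing clvk code: `DeviceProps`, `known_devices` and `lookup_device_props` are hypothetical names, and the device names and numbers in the table are made up purely for illustration. The lookup key is assumed to be the device name reported by Vulkan.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>
#include <unordered_map>

// Hypothetical container for values that Vulkan cannot report.
struct DeviceProps {
    uint32_t max_compute_units;      // for CL_DEVICE_MAX_COMPUTE_UNITS
    uint64_t global_mem_cache_size;  // for CL_DEVICE_GLOBAL_MEM_CACHE_SIZE (bytes)
};

// Placeholder values returned for devices that are not in the table.
constexpr DeviceProps kPlaceholderProps{1, 0};

// Illustrative entries only: the names and numbers below are invented.
// Real entries should document where each value comes from.
inline const std::unordered_map<std::string, DeviceProps>& known_devices() {
    static const std::unordered_map<std::string, DeviceProps> table = {
        {"Example GPU A", {16, 128 * 1024}},
        {"Example GPU B", {4, 32 * 1024}},
    };
    return table;
}

// Returns the table entry for a known device; otherwise warns on stderr
// and falls back to the placeholder values.
inline DeviceProps lookup_device_props(const std::string& device_name) {
    const auto& table = known_devices();
    auto it = table.find(device_name);
    if (it != table.end()) {
        return it->second;
    }
    std::fprintf(stderr,
                 "warning: no known properties for device '%s', "
                 "reporting placeholder values\n",
                 device_name.c_str());
    return kPlaceholderProps;
}
```

The `clGetDeviceInfo` implementation would then call something like `lookup_device_props(name)` once per device and cache the result, so unknown devices behave exactly as they do today apart from the warning.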

jrprice, Aug 14 '20 15:08

Looking at what MACE is doing, I'd say the ideas are sound but OpenCL doesn't give enough guarantees and information for this to be portable. It may work well on some devices but this is not a generic solution.

Long-term, I want something like https://gitlab.khronos.org/vulkan/vulkan/-/merge_requests/3190. I've made a bit more progress on this than what is visible in the PR but unfortunately don't have too much time to spend on this at the moment.

As to what we could do short-term, I'm fine with having tables for known devices in clvk as long as we clearly document where the values are coming from. However, even that is not always simple. Some devices (e.g. Mali devices) can return different values depending on how they've been integrated in the platform. We'd have to look at metadata beyond the device name / ID to report accurate numbers.

kpet, Aug 19 '20 18:08

> Looking at what MACE is doing, I'd say the ideas are sound but OpenCL doesn't give enough guarantees and information for this to be portable. It may work well on some devices but this is not a generic solution.

Sure, I'm not endorsing the specific heuristics used by MACE, but as long as these queries exist people will find (questionable) ways to use them :-)

> Long-term, I want something like https://gitlab.khronos.org/vulkan/vulkan/-/merge_requests/3190. I've made a bit more progress on this than what is visible in the PR but unfortunately don't have too much time to spend on this at the moment.

Thanks for the link, this definitely looks like what we'd want in the future.

> As to what we could do short-term, I'm fine with having tables for known devices in clvk as long as we clearly document where the values are coming from. However, even that is not always simple. Some devices (e.g. Mali devices) can return different values depending on how they've been integrated in the platform. We'd have to look at metadata beyond the device name / ID to report accurate numbers.

Good point. On Android devices we could get the specific chipset or product model via __system_property_get and use that (probably only necessary for Mali?).
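A rough sketch of how that could look, assuming the bionic `__system_property_get` declared in `<sys/system_properties.h>`. The helper names `get_system_property` and `make_device_key` are hypothetical, as is the choice of combining the GPU name with a property such as `ro.board.platform` to form the table key. Outside Android the lookup just returns an empty string.

```cpp
#include <cstddef>
#include <string>
#if defined(__ANDROID__)
#include <sys/system_properties.h>
#endif

// Returns the value of an Android system property, or an empty string if
// the property is unset or this is not an Android build.
inline std::string get_system_property(const char* name) {
#if defined(__ANDROID__)
    char value[PROP_VALUE_MAX] = {0};
    int len = __system_property_get(name, value);
    return len > 0 ? std::string(value, static_cast<std::size_t>(len))
                   : std::string();
#else
    (void)name;  // unused off Android
    return std::string();
#endif
}

// Hypothetical helper: combines the GPU name with a platform identifier so
// that the same GPU integrated in different chipsets gets distinct keys.
// Falls back to the GPU name alone when no platform string is available.
inline std::string make_device_key(const std::string& gpu_name,
                                   const std::string& platform) {
    return platform.empty() ? gpu_name : gpu_name + "@" + platform;
}
```

Usage would be along the lines of `make_device_key(device_name, get_system_property("ro.board.platform"))`, with the result used to index the known-devices table. Which property best identifies the integration is an open question. `ro.product.model` is another candidate.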

jrprice, Aug 19 '20 19:08