iree
Better error message when device not found
Request description
In https://github.com/nod-ai/SHARK-Platform/issues/264, I encountered an error message that looked like this:
ValueError: <vm>:0: NOT_FOUND; HAL device `__device_0` not found or unavailable: #hal.device.target<"hip", {legacy_sync}, [#hal.executable.target<"rocm", "rocm-hsaco-fb", {iree.gpu.target = #iree_gpu.target<arch = "gfx1100", features = "", wgp = <compute = fp64|fp32|fp16|int64|int32|int16|int8, storage = b64|b32|b16|b8, subgroup = shuffle|arithmetic, dot = dp4xi8toi32, mma = [<WMMA_F32_16x16x16_F16>, <WMMA_F16_16x16x16_F16>, <WMMA_I32_16x16x16_I8>], subgroup_size_choices = [32, 64], max_workgroup_sizes = [1024, 1024, 1024], max_thread_count_per_workgroup = 1024, max_workgroup_memory_bytes = 65536, max_workgroup_counts = [2147483647, 2147483647, 2147483647]>>, ukernels = "none"}>]>;
It would be very helpful if, upon encountering an error like this, IREE enumerated the available devices and reported something like: "You requested device X, but only devices Y, Z, and W are available. Did you mean to call function f with argument device=Y instead of device=X?"
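To make the request concrete, here is a minimal Python sketch of how such a message could be assembled. It is only an illustration, not IREE's implementation: the function name `format_device_not_found` and the device strings in the usage example are hypothetical, and the list of available devices is assumed to come from whatever enumeration the runtime already performs.

```python
import difflib


def format_device_not_found(requested, available):
    """Build a friendlier 'device not found' message (hypothetical helper).

    requested: the device name the caller asked for.
    available: names of devices that were actually enumerated.
    """
    if not available:
        return f"Requested device '{requested}', but no devices are available."

    listing = ", ".join(f"'{name}'" for name in available)
    message = (
        f"Requested device '{requested}', but only these devices are "
        f"available: {listing}."
    )

    # Suggest the closest available name, if any is reasonably similar.
    close = difflib.get_close_matches(requested, available, n=1)
    if close:
        message += f" Did you mean device='{close[0]}'?"
    return message


if __name__ == "__main__":
    # Illustrative device names only; real names depend on the system.
    print(format_device_not_found("hip://1", ["hip://0", "local-task"]))
```

The suggestion step uses `difflib.get_close_matches` from the standard library with its default similarity cutoff, so no suggestion is appended when nothing available resembles the requested name.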
What component(s) does this issue relate to?
Runtime
Additional context
No response