Fraser Cormack
Without the barrier at the end of `barrierOR`, it is possible for work-item 0 to start the next loop iteration and update `predicates[0]` while other work-items are still...
This might be a simple question to answer as it's arguably the status quo in the specification, but because it has caused a discussion internally I'd like to ask for...
This is mostly just a copy of the CUDA version of this implementation.
When configured without match files, the CTS tried to run a command and its arguments as a single command, because they were stored in a single string variable. It...
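The failure mode described above, a whole command line being treated as one program name, can be illustrated with a minimal sketch. It assumes plain whitespace-separated arguments with no quoting; `splitCommand` and `checkSplitCommand` are hypothetical names and are not taken from the CTS, whose actual fix is not shown in this excerpt.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not from the CTS): split a command line on whitespace
// into argv-style tokens. Assumption: no quoting or escaping is needed.
std::vector<std::string> splitCommand(const std::string &commandLine) {
  std::istringstream stream(commandLine);
  std::vector<std::string> argv;
  std::string token;
  // operator>> skips leading whitespace and reads one token at a time.
  while (stream >> token)
    argv.push_back(token);
  return argv;
}

// Usage check: the program name and its two arguments become three tokens
// instead of one string that would be looked up as a single executable.
void checkSplitCommand() {
  const std::vector<std::string> argv = splitCommand("prog --flag value");
  assert(argv.size() == 3);
  assert(argv[0] == "prog");
}
```

A runner that receives the tokens separately can pass the first as the executable and the rest as its arguments, which is the distinction the description is drawing.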
Also trim down a number of HIP 'program' match entries which appear to be passing now.
Also fix up the `DeadArgumentElimination` passes to correctly preserve the annotations; when removing arguments from functions, dead parameters need to be pruned and alive ones may need their values shifted down by...
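The pruning and shifting described above can be sketched in isolation. This is a simplified model, not the LLVM pass: it assumes annotations are keyed by argument index, and `ArgAnnotations`, `remapAnnotations`, and `checkRemapAnnotations` are hypothetical names introduced only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical model: one annotation string per annotated argument index.
using ArgAnnotations = std::map<std::size_t, std::string>;

// Rebuild the annotation table after dead arguments are removed.
// isAlive[i] says whether original argument i survives.
ArgAnnotations remapAnnotations(const ArgAnnotations &oldAnnotations,
                                const std::vector<bool> &isAlive) {
  ArgAnnotations newAnnotations;
  std::size_t newIndex = 0;
  for (std::size_t oldIndex = 0; oldIndex < isAlive.size(); ++oldIndex) {
    if (!isAlive[oldIndex])
      continue; // dead parameter: its annotation is pruned
    auto it = oldAnnotations.find(oldIndex);
    if (it != oldAnnotations.end())
      newAnnotations[newIndex] = it->second; // shifted down past dead ones
    ++newIndex;
  }
  return newAnnotations;
}

// Usage check: argument 1 is dead, so arguments 2 and 3 move to 1 and 2.
void checkRemapAnnotations() {
  const ArgAnnotations before = {{0, "a"}, {2, "b"}, {3, "c"}};
  const ArgAnnotations after =
      remapAnnotations(before, {true, false, true, true});
  assert(after.size() == 3);
  assert(after.at(1) == "b");
  assert(after.at(2) == "c");
}
```

Each surviving argument's new index is its old index minus the number of dead arguments that preceded it, which is why a single running counter is enough here.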
We need this to successfully compile the subgroup device_code test.
Copies code for the pool descriptor from the CUDA adapter. Fixes https://github.com/oneapi-src/unified-runtime/issues/1479.
This patch adds two kernel properties to allow users to specify the maximum work-group size that a kernel will be invoked with. The `max_work_group_size` property corresponds to the `intel::max_work_group_size` function...