Alexey Bader
As I mentioned in my first comment, it's a known issue of the Level Zero backend. > 0 means device time counter is zero, that should be the time when the...
`sleep` modifies the execution time of the host application. SYCL events return time for things happening on the device, which are executed asynchronously to the host, so adding `sleep` like this...
Quite possible. I see that Level Zero might return different values for different device types, but I'm not sure if the plug-in takes this into account. [out] Returns the resolution...
> This issue was fixed for Intel GPUs but still relevant for NVidia - so we have to fix it I guess. The ticket description doesn't mention that this the...
> Would you be able to share how this was tackled for Intel GPUs, maybe there is something that could be generalized? > Thanks @jchlanda! I'm not a compiler guy,...
I see that this issue also impacts `test_optional_kernel_features`.
The test passes on the same machine with DPC++ built from https://github.com/intel/llvm/commit/bfc7e984592314b6fc6c715f2fe8edf72b1cc6f6. `test_optional_kernel_features` still fails.
@AerialMantis, if you can't reproduce this problem with a recent DPC++ build, please close the ticket.
That's very strange, as I haven't updated the CUDA version, so my theory is that patches from llvm.org and the SPIR-V translator remove uses of atomic add instructions.
@jchlanda, are you able to build `test_optional_kernel_features`?