[SYCL] Avoid infinite loop when kernel fails to compile with memory error
The runtime retries building a kernel if compilation fails. However, when UR returned a memory error, the attempt counter was not compared against the maximum number of attempts, so the compiler was called again and again without bound, and the loop counter would eventually have overflowed.
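To illustrate the shape of the fix, here is a minimal, self-contained sketch of a retry loop in which every failure path, including the memory-error one, is checked against the attempt limit. It is not the actual runtime code: `BuildResult`, `MaxBuildAttempts`, `BuildOutcome` and `buildWithRetry` are hypothetical names chosen for this example, and the real limit and error codes may differ.

```cpp
#include <functional>

// Hypothetical result codes for this sketch; the real runtime works with
// UR result values instead.
enum class BuildResult { Success, OutOfMemory, BuildFailure };

// Hypothetical limit; the actual maximum used by the runtime is not shown here.
constexpr int MaxBuildAttempts = 3;

struct BuildOutcome {
  BuildResult Result; // result of the last attempt
  int Attempts;       // number of times the build callback was invoked
};

// Calls TryBuild until it succeeds or the attempt limit is reached.
BuildOutcome buildWithRetry(const std::function<BuildResult()> &TryBuild) {
  for (int Attempt = 1;; ++Attempt) {
    const BuildResult Result = TryBuild();
    if (Result == BuildResult::Success)
      return {Result, Attempt};

    // The limit is checked before any error-specific handling, so a memory
    // error cannot bypass it and keep the loop running forever.
    if (Attempt >= MaxBuildAttempts)
      return {Result, Attempt};

    if (Result == BuildResult::OutOfMemory) {
      // Assumption: a real runtime might try to free resources here before
      // the next attempt. Nothing is done in this sketch.
    }
  }
}
```

With a callback that always returns `BuildResult::OutOfMemory`, the loop stops after `MaxBuildAttempts` calls instead of spinning indefinitely.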
@VerenaBeckham is it possible to add a test that would fail without your patch and work after your patch?
I'm not sure. Even if you could write a test that allocates so much memory that it runs out, this test might still pass on a different architecture with more memory. Or do you know of a way to mock up this error?
I think a unit test with `UrMock` should do the trick for that: https://github.com/intel/llvm/blob/b437083ea7b827ea6798e3fcfae1fe0f926d1d6d/sycl/doc/developer/ContributeToDPCPP.md#dpc-headers-and-runtime-tests
You can find examples if you grep `UrMock` within `sycl/unittests`. Basically, `UrMock` allows you to redefine the behavior of any `UR` function for a given test, so you could redefine the function that is triggering the memory error to always return a memory error without even trying to allocate.
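The idea can be sketched in plain C++ without any SYCL dependency. This is not the `UrMock` API; it only mimics the principle of swapping a backend entry point for one that always reports a memory error and then counting how often the code under test calls it. `Status`, `Backend`, `MaxAttempts`, `buildKernel` and `countBuildCallsUnderMemoryError` are all hypothetical names for this example.

```cpp
#include <functional>

enum class Status { Success, OutOfMemory };

// Hypothetical stand-in for a backend whose entry points can be replaced
// per test. By default the build succeeds.
struct Backend {
  std::function<Status()> ProgramBuild = [] { return Status::Success; };
};

// Hypothetical attempt limit for this sketch.
constexpr int MaxAttempts = 3;

// Code under test: retries the build, but never more than MaxAttempts times.
Status buildKernel(Backend &B) {
  Status S = Status::OutOfMemory;
  for (int Attempt = 0; Attempt < MaxAttempts; ++Attempt) {
    S = B.ProgramBuild();
    if (S == Status::Success)
      break;
  }
  return S;
}

// Test helper: installs a replacement that always reports a memory error
// without allocating anything, and returns how many times it was called.
int countBuildCallsUnderMemoryError() {
  Backend B;
  int Calls = 0;
  B.ProgramBuild = [&Calls] {
    ++Calls;
    return Status::OutOfMemory;
  };
  buildKernel(B);
  return Calls;
}
```

A test built this way does not depend on how much memory the machine has: it fails (by hanging or exceeding the expected count) if the retry loop ignores the limit, and passes once the call count equals the limit.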
Thanks for the hint! I have added a test.
If you're happy with this @maarquitos14 do you want to "approve the workflows" and kick off the testing?
I'm sorry, I forgot to push my clang-format fixes. 🤦 Fixed now.