Results 2 issues of ericlars

# Summary

[Intel LLVM](https://github.com/intel/llvm/commit/dd418459868a976cd2eeae367fea6b92795ea611) introduced multiple CUDA streams per SYCL queue, which broke the cuSOLVER scope handler because it assumed one stream per device per thread. This broke asynchronous submissions,...