oneMKL
unsupported getri_batch/getrf_batch for Nvidia
As per the oneMKL LAPACK documentation, the APIs getri_batch/getrf_batch are not implemented for Nvidia:
```cpp
void geqrf_batch(sycl::queue &queue, std::int64_t m, std::int64_t n,
                 sycl::buffer<std::complex /* ... */

void getri_batch(sycl::queue &queue, std::int64_t n, sycl::buffer /* ... */
```
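For reference, here is a minimal sketch of the strided getri_batch call we are attempting (the parameter order and the scratchpad query follow the oneMKL specification as we read it; the function and variable names here are purely illustrative):

```cpp
#include <algorithm>
#include <complex>
#include <cstdint>
#include <sycl/sycl.hpp>
#include "oneapi/mkl.hpp"

// Illustrative only: invert a strided batch of LU-factored matrices.
// `a` holds the LU factors on input (e.g. from getrf_batch) and the
// inverses on output; `ipiv` holds the pivots from the factorization.
void invert_batch(sycl::queue &q, std::int64_t n, std::int64_t lda,
                  std::int64_t stride_a, std::int64_t stride_ipiv,
                  std::int64_t batch_size,
                  sycl::buffer<std::complex<float>> &a,
                  sycl::buffer<std::int64_t> &ipiv) {
  std::int64_t scratch_size =
      oneapi::mkl::lapack::getri_batch_scratchpad_size<std::complex<float>>(
          q, n, lda, stride_a, stride_ipiv, batch_size);
  sycl::buffer<std::complex<float>> scratch{
      sycl::range<1>(std::max<std::int64_t>(scratch_size, 1))};
  // This is the call that is currently unsupported on the Nvidia backend.
  oneapi::mkl::lapack::getri_batch(q, n, a, lda, stride_a, ipiv, stride_ipiv,
                                   batch_size, scratch, scratch_size);
}
```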
Can you please let us know when support for these APIs will be available?
We have implemented a workaround using SYCL interop; would you be interested in this?
Hi @Soujanyajanga, thanks for raising this issue. I can give you an update on the progress of these operations for the Nvidia backend.
For getrf_batch, Nvidia supports an equivalent to getrf but not to getrf_batch, so to support this we need to implement it by manually batching the regular getrf implementation. There is a pull request open just now which does this: https://github.com/oneapi-src/oneMKL/pull/209.
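To illustrate the idea (this is not the code from that pull request, just a rough sketch under our own naming and layout assumptions: column-major, one matrix every stride_a elements), manually batching the regular getrf over a strided batch could look roughly like this:

```cpp
#include <algorithm>
#include <complex>
#include <cstdint>
#include <sycl/sycl.hpp>
#include "oneapi/mkl.hpp"

// Rough sketch: emulate getrf_batch by calling the regular getrf once per
// matrix, viewing each matrix and pivot vector through a sub-buffer.
// (Sub-buffer offsets may be subject to device alignment requirements.)
void getrf_batch_fallback(sycl::queue &q, std::int64_t m, std::int64_t n,
                          sycl::buffer<std::complex<float>> &a, std::int64_t lda,
                          std::int64_t stride_a,
                          sycl::buffer<std::int64_t> &ipiv,
                          std::int64_t stride_ipiv, std::int64_t batch_size) {
  std::int64_t scratch_size =
      oneapi::mkl::lapack::getrf_scratchpad_size<std::complex<float>>(q, m, n, lda);
  sycl::buffer<std::complex<float>> scratch{
      sycl::range<1>(std::max<std::int64_t>(scratch_size, 1))};
  for (std::int64_t i = 0; i < batch_size; ++i) {
    // Sub-buffers covering the i-th matrix and its pivot vector.
    sycl::buffer<std::complex<float>> a_i(a, sycl::id<1>(i * stride_a),
                                          sycl::range<1>(stride_a));
    sycl::buffer<std::int64_t> ipiv_i(ipiv, sycl::id<1>(i * stride_ipiv),
                                      sycl::range<1>(stride_ipiv));
    oneapi::mkl::lapack::getrf(q, m, n, a_i, lda, ipiv_i, scratch, scratch_size);
  }
}
```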
For getri_batch, however, Nvidia does have an equivalent, but it's provided in cuBLAS rather than cuSOLVER, which means this would require some changes to the Nvidia backend. Unfortunately we don't have any immediate plans to do this; however, we could incorporate it into our future roadmap.
Edit: I originally stated that getri_batch was not provided by Nvidia, but it is in fact provided, in cuBLAS rather than cuSOLVER.
> We have implemented a workaround using SYCL interop; would you be interested in this?

I have a quick question: what native function have you been using as your workaround? As far as I am aware, getri does not have a native cuSOLVER equivalent.
I have managed to answer my own question: cuSOLVER does not implement getri, but cuBLAS does: https://docs.nvidia.com/cuda/cublas/index.html#cublas-lt-t-gt-getribatched
I think this is something we can support, but it may take a bit of additional work in the backend to get the appropriate cuBLAS handles, etc.
> We have implemented a workaround using SYCL interop; would you be interested in this?
>
> I have a quick question: what native function have you been using as your workaround? As far as I am aware, getri does not have a native cuSOLVER equivalent.

For getri_batch, the CUDA equivalent API is cublasCgetriBatched. We have integrated the CUDA API with SYCL interop as a workaround. Are you interested in this approach?
@Soujanyajanga yes, I think this is the approach we would take. If you can share your workaround, that could be useful, thanks. I've added this to our roadmap so someone will take a look at it.
Here below is the workaround implemented using SYCL interop:

```cpp
static sycl::queue *handle;
error = (handle = &dpct::get_default_queue(), 0);

// ... (creating/adjusting the parameters for the CUDA API) ...

cublasStatus_t err1;
cublasHandle_t handle_cuda1;
CUstream streamId1 = sycl::get_native<sycl::backend::cuda>(*handle);
err1 = cublasCreate(&handle_cuda1);
err1 = cublasSetStream(handle_cuda1, streamId1);
err1 = cublasCgetriBatched(handle_cuda1, n, (cuFloatComplex **)A_array, n, dipiv,
                           (cuFloatComplex **)Ainv_array, n, dinfo_array, batch);
```
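For reference, here is a self-contained variant of the same interop idea without the dpct helpers (illustrative only: A_array, Ainv_array, dipiv, dinfo_array and batch are assumed to be device pointers/values prepared by the caller as in the snippet above, and depending on the DPC++ release the CUDA backend enum may be spelled sycl::backend::ext_oneapi_cuda instead of sycl::backend::cuda):

```cpp
#include <sycl/sycl.hpp>
#include <cuda_runtime.h>
#include <cublas_v2.h>

// Illustrative sketch: run cublasCgetriBatched on the CUDA stream that
// backs a SYCL queue.
void cgetri_batched_via_interop(sycl::queue &q, int n,
                                cuFloatComplex **A_array, int *dipiv,
                                cuFloatComplex **Ainv_array, int *dinfo_array,
                                int batch) {
  // Let previously submitted SYCL work finish before enqueuing native work
  // on the queue's underlying CUDA stream.
  q.wait();

  cudaStream_t stream = sycl::get_native<sycl::backend::cuda>(q);

  cublasHandle_t h;
  if (cublasCreate(&h) != CUBLAS_STATUS_SUCCESS)
    return;
  cublasSetStream(h, stream);

  cublasCgetriBatched(h, n, A_array, n, dipiv, Ainv_array, n,
                      dinfo_array, batch);

  // Wait for the batched inversion before handing control back to SYCL.
  cudaStreamSynchronize(stream);
  cublasDestroy(h);
}
```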