oneMKL
[CUDA][HIP] Use device to get native context
Since https://github.com/oneapi-src/unified-runtime/pull/999, it is no longer valid to get the native context from the SYCL context on a multi-GPU system. The get_native function for contexts has been deprecated for this reason. See https://github.com/intel/llvm/pull/10975
Similar ticket: https://github.com/oneapi-src/oneDNN/pull/1765
reading your changes, I have a question.
For example,
auto cudaDevice = sycl::get_native<sycl::backend::ext_oneapi_cuda>(queue.get_device());
Is the type of cudaDevice "CUdevice"?
Hi @jinz2014 yes you are correct!
cufft_run.txt
All the DFT changes look good to me and I've run the DFT tests successfully. I'd like to see test logs for the other backends before I approve.
AMD tests for lapack and blas are all passing: test_amd.txt
8 lapack Nvidia tests are failing on GTX1050, but these tests are also failing on the develop branch: test_cuda_lapack.txt
Nvidia blas tests are passing: test_cuda_blas.txt
I see all the buffer tests failing for the rocblas backend with PI_ERROR_INVALID_OPERATION.
Logs: PR_425.txt
The failures are not because of the changes in this PR, but rather a recent change in the compiler. All these tests are expected to pass once https://github.com/oneapi-src/unified-runtime/pull/1226 and https://github.com/intel/llvm/pull/12297 are merged.
Some test results:
CUDA
gtx1050.txt
Some failures due to precision, also present on the develop branch.
HIP
gfx90a_oneMKL_test.txt
Test failures in HIP are also present on the develop branch:
gfx90a_oneMKL_test_develop_branch.txt
I am not sure how to build/run the FFT tests. Are there some build/test instructions that I can follow?
~~In terms of building with icpx 2024.0.0 for CUDA. I am getting a segfault at linking with develop branch.~~
Fixed. LD_LIBRARY_PATH problems -_-
I can successfully build this branch with icpx 2024.0.2 for CUDA
Thanks a lot @hdelan ! The instructions are here but need to be improved.
The short answer is that you should just need to add -DENABLE_CUFFT_BACKEND=True -DENABLE_ROCFFT_BACKEND=True to also test the DFT domain with the native CUDA and HIP backends.
If you are explicitly setting -DTARGET_DOMAINS in your CMake command you will also need to append dft to the list; if you are not setting it, dft is enabled by default.
If you don't want to build and test the other domains again you can use -DTARGET_DOMAINS=dft.
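For reference, a minimal sketch of a configure-and-test sequence using only the flags named in this comment. The source and build paths (`.` and `build`) are placeholders, running the tests through `ctest` is an assumption, and a real configuration will likely also need a compiler choice and other backend options, which are omitted here.

```shell
# Sketch only: the three -D flags come from the comment above;
# the source path "." and build directory "build" are placeholders.
cmake -S . -B build \
  -DENABLE_CUFFT_BACKEND=True \
  -DENABLE_ROCFFT_BACKEND=True \
  -DTARGET_DOMAINS=dft

# Build, then run the tests (assumes the tests are registered with CTest).
cmake --build build
ctest --test-dir build
```

Dropping `-DTARGET_DOMAINS=dft` should build every domain, since dft is enabled by default when the list is not set explicitly.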
Thanks @Rbiessy !
Building rocFFT is broken for me, but this PR does not touch that code. Building with cuFFT is OK. Here are the updated test results for all of oneMKL on CUDA, including cuBLAS, cuFFT, cuRAND and cuSOLVER:
Thanks! LGTM
Thanks for the review. Let me know @lhuot or @mmeterel if you need more time; otherwise I will go ahead and merge this on Monday.