Rafal Bielski
> Maybe we just need a check here to ensure that the pointer is properly aligned, in addition to the existing check that the pattern size is supported? > >...
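To illustrate the kind of check suggested in the quote, here is a minimal sketch that tests pointer alignment together with the pattern size. The helper name `can_fill_with_pattern` and the supported sizes (1, 2 and 4 bytes) are assumptions made for this example, not taken from the PR under discussion.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not from the PR): returns true only if the pattern
// size is one of the assumed supported sizes AND the destination pointer
// is aligned to that size.
bool can_fill_with_pattern(const void* ptr, std::size_t pattern_size) {
    // Assumption for this sketch: only 1-, 2- and 4-byte patterns are supported.
    const bool size_supported =
        pattern_size == 1 || pattern_size == 2 || pattern_size == 4;
    if (!size_supported) {
        return false;
    }
    // Convert the address to an integer and check it is a multiple of the size.
    const auto address = reinterpret_cast<std::uintptr_t>(ptr);
    return address % pattern_size == 0;
}
```

A caller could fall back to a slower, alignment-agnostic path whenever this returns `false`, rather than dereferencing a misaligned pointer.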
Thank you @martygrant, and apologies that I missed your PRs. Both look good to me!
This work has been moved to https://github.com/uxlfoundation/oneMath/pull/699
Hi @jgtong, here are my commands and outputs on H100:
```console
$ icpx --version
Intel(R) oneAPI DPC++/C++ Compiler 2025.1.0 (2025.1.0.20250317)
$ sycl-ls
[opencl:cpu][opencl:0] Intel(R) OpenCL, Intel(R) Xeon(R) Gold 5418Y OpenCL 3.0...
```
Hi @jgtong, I re-tested this with the same compiler version as yours on the following systems: * NVIDIA H100 + Intel Xeon Gold 5418Y, CUDA Toolkit 12.8, cuDNN 9.7.0, CUDA...
Hi @jgtong, that commit doesn't include #97 as I opened both PRs at the same time independently. How is it possible that it doesn't crash for you on dereferencing the...
The PR is now rebased on the latest `main` after #97.
Could you add the same in the CUDA and HIP versions for consistency of the outputs?