JackAKirk
Signed-off-by: JackAKirk # Description This is a bug fix for failures that have appeared since the multi-streams implementation of the CUDA backend in intel/llvm (failures identified here https://github.com/oneapi-src/oneMKL/pull/209#issuecomment-1192001880): The failed tests...
cc @dkhaldi Implementation corresponding to the matrix extension proposal section "Bitwise Multiply and Add" in https://github.com/intel/llvm/pull/4695 Integration tests here: https://github.com/intel/llvm-test-suite/pull/760
I've added the bmad (nvptx backend only) feature description that has made use of precision::b1 in the implementation here: https://github.com/intel/llvm/pull/5363 I can also add some other nvidia specific information as...
Fixes a bug where, if `joint_matrix_load` attempts to load a `joint_matrix` from an array of `const T`, incorrect behaviour occurs or an error is thrown. To fix this we make...
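The description above is cut off before the fix itself, so the following is only a minimal sketch of one way a `const T` element type can break type-based dispatch, and how stripping the qualifier avoids it. The names `ElemKind`, `naive_kind` and `kind_ignoring_const` are hypothetical and are not part of the `joint_matrix` extension; whether the PR uses this approach is an assumption.

```cpp
#include <type_traits>

// Hypothetical element-type tags, used only for this illustration.
enum class ElemKind { Float, Double, Unsupported };

// Naive dispatch: compares T directly, so `const float` does not match `float`.
template <typename T>
constexpr ElemKind naive_kind() {
  return std::is_same<T, float>::value    ? ElemKind::Float
         : std::is_same<T, double>::value ? ElemKind::Double
                                          : ElemKind::Unsupported;
}

// Const-tolerant dispatch: strips the const qualifier before comparing,
// so `const float` and `float` select the same code path.
template <typename T>
constexpr ElemKind kind_ignoring_const() {
  return naive_kind<typename std::remove_const<T>::type>();
}
```

With this sketch, `naive_kind<const float>()` yields `ElemKind::Unsupported`, while `kind_ignoring_const<const float>()` yields `ElemKind::Float`.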
This is a general fix for https://github.com/intel/llvm/issues/6055. CUDA device interop is not available yet but a corresponding fix will be added to the CUDA specialization of `make_device` in https://github.com/intel/llvm/pull/6202 shortly....
This PR aims to fix issue https://github.com/intel/llvm/issues/5991 and provide efficient, working marray math function implementations for all backends. marray math function support is currently switched on for {n} ({n}...
This is a move towards the future-looking joint_matrix, joint_matrix_load, joint_matrix_store APIs. The aim is to make the CUDA and Intel implementations of the joint_matrix extension use matching interfaces, whilst...
# Description In some cases cuSolver operations can return a success status code while failing. The previous implementation of this check was done via SYCL and required the CPU to...
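The truncated description does not say how the failure is reported. The sketch below assumes the LAPACK-style convention that cuSolver dense routines follow, where a separate integer info value is written alongside the returned status: zero means success, a negative value flags an invalid argument, and a positive value flags a routine-specific failure. `Status`, `Outcome` and `classify` are hypothetical stand-ins, not cuSolver or oneMKL types.

```cpp
// Hypothetical stand-in for the status value returned by the library call.
enum class Status { Success, Failure };

// Hypothetical classification of a call's overall result.
enum class Outcome { Ok, BadStatus, InvalidArgument, NumericalFailure };

// Combine the returned status with the info value written by the routine.
// A call counts as successful only when both indicate success.
constexpr Outcome classify(Status status, int info) {
  return status != Status::Success ? Outcome::BadStatus
         : info < 0                ? Outcome::InvalidArgument
         : info > 0                ? Outcome::NumericalFailure
                                   : Outcome::Ok;
}
```

The case the description is concerned with is `classify(Status::Success, info)` with a non-zero `info`: the status alone would suggest success, but the combined check does not.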
- make the relaxed fence a no-op to satisfy the SYCL spec. - make acquire/release/acq_rel use the lighter acq_rel fence for sm_70 instead of the seq_cst fence.
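The mapping described in these two points can be sketched as a small selection function. This is an illustration only: it uses `std::memory_order` in place of the SYCL memory order type, `Fence` and `select_fence` are hypothetical names, it covers only the sm_70 case mentioned above, and keeping the seq_cst fence for `seq_cst` ordering is an assumption rather than something the description states.

```cpp
#include <atomic>  // std::memory_order, used here as a stand-in for sycl::memory_order

// Hypothetical labels for the fence a backend would emit.
enum class Fence { None, AcqRel, SeqCst };

// Select the fence for a given memory order on sm_70, per the description:
//   relaxed                   -> no fence at all (no-op)
//   acquire/release/acq_rel   -> the lighter acq_rel fence
//   seq_cst                   -> the seq_cst fence (assumed)
// memory_order_consume is not discussed in the source; this sketch lets it
// fall into the acq_rel case.
constexpr Fence select_fence(std::memory_order order) {
  return order == std::memory_order_relaxed   ? Fence::None
         : order == std::memory_order_seq_cst ? Fence::SeqCst
                                              : Fence::AcqRel;
}
```

Expressing the choice as a pure function keeps the policy separate from whatever instruction the backend finally emits.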
If, for example, I add `assert(0);` to the kernel in the vectorAdd sample: https://github.com/ROCm-Developer-Tools/HIP-Examples/blob/master/vectorAdd/vectoradd_hip.cpp via ``` diff --git a/vectorAdd/vectoradd_hip.cpp b/vectorAdd/vectoradd_hip.cpp index 0362c8a..a20bd2c 100644 --- a/vectorAdd/vectoradd_hip.cpp +++ b/vectorAdd/vectoradd_hip.cpp @@ -47,7 +47,7...