JackAKirk

Results: 147 comments by JackAKirk

Updated usage of joint_matrix can be seen in the changes here: https://github.com/intel/llvm-test-suite/pull/1183.

@dkhaldi @yubingex007-a11y @gmlueck matrix-unified.hpp contains the agreed interfaces for joint_matrix_load, joint_matrix_store and joint_matrix_mad. These functions call backend implementations depending on compiler flags `__NVPTX__`, `__SPIR__` (and later we can also add...
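As a rough illustration of the dispatch pattern this comment describes, here is a minimal sketch of one unified entry point that picks a backend at compile time from the `__NVPTX__` / `__SPIR__` macros. Everything in the `sketch` namespace (`tile`, `backend`, `active_backend`, `mad`, `mad_reference`) is a hypothetical name invented for this example. It is not the contents of matrix-unified.hpp and not the joint_matrix API, and each backend branch just forwards to the same plain reference loop where a real implementation would presumably call backend-specific code.

```cpp
#include <array>
#include <cstddef>

namespace sketch {

// Hypothetical stand-in for a matrix tile: R rows by C columns.
template <typename T, std::size_t R, std::size_t C>
using tile = std::array<std::array<T, C>, R>;

namespace detail {
// Plain reference multiply-add, c += a * b, used here by every branch.
template <typename T, std::size_t M, std::size_t K, std::size_t N>
void mad_reference(const tile<T, M, K> &a, const tile<T, K, N> &b,
                   tile<T, M, N> &c) {
  for (std::size_t i = 0; i < M; ++i)
    for (std::size_t j = 0; j < N; ++j)
      for (std::size_t k = 0; k < K; ++k)
        c[i][j] += a[i][k] * b[k][j];
}
} // namespace detail

enum class backend { nvptx, spir, host };

// Reports which branch the preprocessor selected for this compilation.
constexpr backend active_backend() {
#if defined(__NVPTX__)
  return backend::nvptx;
#elif defined(__SPIR__)
  return backend::spir;
#else
  return backend::host;
#endif
}

// Unified interface: one signature, backend chosen by predefined macros.
template <typename T, std::size_t M, std::size_t K, std::size_t N>
void mad(const tile<T, M, K> &a, const tile<T, K, N> &b, tile<T, M, N> &c) {
#if defined(__NVPTX__)
  detail::mad_reference(a, b, c); // placeholder for a CUDA-specific path
#elif defined(__SPIR__)
  detail::mad_reference(a, b, c); // placeholder for a SPIR-V-specific path
#else
  detail::mad_reference(a, b, c); // host fallback
#endif
}

} // namespace sketch
```

Under these assumptions, callers only ever see `sketch::mad`, and adding another backend would mean adding one more `#elif` branch rather than changing the public signature.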

Thanks for posting this. I have a few questions: 1. Just to be completely clear: Is the implication that if `ext_oneapi_can_access_peer(device_b)` returns true when called from device_a then users can...

I think this looks quite good from the point of view of the cuda backend (apart from the one issue I describe below). I can try a simple implementation to...

> * Your comments about the interaction between buffer-copy optimization and the P2P API assume that APIs in this extension are implemented by directly calling CUDA APIs. Instead, the extension...

/verify with https://github.com/intel/llvm-test-suite/pull/1002

I've added scalar_vector_* lists in this PR that omit marray types, so that math functions can distinguish the marray implementations I added. The type lists including marrays, used in e.g....

/verify with https://github.com/intel/llvm-test-suite/pull/1002

> /verify with [intel/llvm-test-suite#1002](https://github.com/intel/llvm-test-suite/pull/1002)

FYI I don't have access to see the failures from this. The tests are passing locally for cuda.

> Can we extend existing tests to capture the new sizes?

I did not find any existing tests for marray math builtins: this makes sense since the existing implementation was...