JackAKirk comments

Results: 147 comments of JackAKirk

[SYCL][CUDA] Layout accumulator is specified at load/store.

Updated usage of joint_matrix can be seen in the changes here: https://github.com/intel/llvm-test-suite/pull/1183.

[SYCL][CUDA] Layout accumulator is specified at load/store.

@dkhaldi @yubingex007-a11y @gmlueck matrix-unified.hpp contains the agreed interfaces for joint_matrix_load, joint_matrix_store and joint_matrix_mad. These functions call backend implementations depending on compiler flags `__NVPTX__`, `__SPIR__` (and later we can also add...

[SYCL][DOC] Initial commit of oneapi extension proposal for adding P2P

Thanks for posting this. I have a few questions: 1. Just to be completely clear: Is the implication that if `ext_oneapi_can_access_peer(device_b)` returns true when called from device_a then users can...

[SYCL][DOC] Initial commit of oneapi extension proposal for adding P2P

I think this looks quite good from the point of view of the CUDA backend (apart from the one issue I describe below). I can try a simple implementation to...

[SYCL][DOC] Initial commit of oneapi extension proposal for adding P2P

> * Your comments about the interaction between buffer-copy optimization and the P2P API assume that APIs in this extension are implemented by directly calling CUDA APIs. Instead, the extension...

[SYCL] Fix marray math function impls

/verify with https://github.com/intel/llvm-test-suite/pull/1002

[SYCL] Fix marray math function impls

I've added scalar_vector_* lists in this PR that omit marray types, so that math functions can distinguish the marray implementations I added. The type lists including marrays, used in e.g....

[SYCL] Fix marray math function impls

/verify with https://github.com/intel/llvm-test-suite/pull/1002

[SYCL] Fix marray math function impls

> /verify with [intel/llvm-test-suite#1002](https://github.com/intel/llvm-test-suite/pull/1002)

FYI I don't have access to see the failures from this. The tests are passing locally for CUDA.

[SYCL] Fix marray math function impls

> Can we extend existing tests to capture the new sizes?

I did not find any existing tests for marray math builtins: this makes sense since the existing implementation was...