[SYCL][CUDA][MATRIX] joint_matrix_bmad implementation

Open JackAKirk opened this issue 3 years ago • 4 comments

cc @dkhaldi

Implementation corresponding to the matrix extension proposal section "Bitwise Multiply and Add" in https://github.com/intel/llvm/pull/4695

Integration tests here: https://github.com/intel/llvm-test-suite/pull/760
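For readers unfamiliar with the operation, here is a minimal host-side sketch of what a bitwise multiply and add computes, assuming the semantics of the CUDA single-bit matrix operations: each output element is the accumulator plus the population count of a bitwise op (XOR or AND) applied across a row of A and a column of B. The function name `reference_bmad`, the packing into `uint32_t` words, and the parameter names are hypothetical and only for illustration; this is not the `joint_matrix_bmad` interface from the proposal.

```cpp
#include <bitset>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical reference, not the extension API.
// a: M rows of single-bit values, each row packed into words_per_vec words.
// b: N columns of single-bit values, each column packed the same way.
// c: M x N accumulator, row-major.
// op: the bitwise operation, e.g. std::bit_xor or std::bit_and.
template <typename BitOp>
std::vector<int32_t> reference_bmad(const std::vector<uint32_t> &a,
                                    const std::vector<uint32_t> &b,
                                    const std::vector<int32_t> &c,
                                    std::size_t M, std::size_t N,
                                    std::size_t words_per_vec, BitOp op) {
  std::vector<int32_t> d(c); // start from the accumulator values
  for (std::size_t i = 0; i < M; ++i) {
    for (std::size_t j = 0; j < N; ++j) {
      int32_t acc = 0;
      for (std::size_t w = 0; w < words_per_vec; ++w) {
        // Combine one word of row i with one word of column j,
        // then count the set bits in the result.
        const uint32_t bits =
            op(a[i * words_per_vec + w], b[j * words_per_vec + w]);
        acc += static_cast<int32_t>(std::bitset<32>(bits).count());
      }
      d[i * N + j] += acc;
    }
  }
  return d;
}
```

Calling `reference_bmad(a, b, c, M, N, words_per_vec, std::bit_xor<uint32_t>())` gives the XOR variant, and passing `std::bit_and<uint32_t>()` gives the AND variant. A host reference of this kind could be used to check device results in tests.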

JackAKirk, Jan 21 '22 17:01

Hi @dkhaldi

If it would help with reviewing, I could add the temporary, initial fp19 implementation, which uses uint32_t directly, to this PR. The uint32_t fp19 cases should be more straightforward to review than the bmad cases: we realized in the end that the fp19 cases can be implemented in a way that is fully compliant with the existing matrix extension, whereas the bmad cases require a different interface.

Otherwise, it is fine to put them up one at a time; I just thought it might be easier to review them together.

Thanks

JackAKirk, Feb 15 '22 17:02

I think separate PRs are better.

dkhaldi, Feb 15 '22 18:02

OK

JackAKirk, Feb 16 '22 08:02

/verify with https://github.com/intel/llvm-test-suite/pull/760

JackAKirk, Aug 10 '22 16:08