
Add support for `c_next` in the `auxinfo_t` struct.

Open • fgvanzee opened this issue 3 years ago • 1 comment

This branch contains preliminary support for a new `.c_next` field within the `auxinfo_t` struct. It is fully implemented for `gemm`. Caveats:

  • The "wrap-around" address computation for the edge cases is not yet verified (but should be close to correct).
  • For now, only the `gemm` macrokernel (`bli_gemm_ker_var2()`) sets the `.c_next` field. The `gemmt`, `trmm`, and `trsm` macrokernels do not yet set it.
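To make the "wrap-around" caveat concrete, here is a minimal sketch of how a next-microtile address for C could be computed inside a two-level loop nest. It is not the code in this branch: `aux_sketch_t`, `next_c_offset()`, `set_c_next()`, and the parameter names are hypothetical, and wrapping back to the first microtile after the last iteration is only one plausible convention.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the one field discussed here. The real
   auxinfo_t has other fields and its own accessors. */
typedef struct
{
    const double* c_next; /* microtile of C expected to be updated next */
} aux_sketch_t;

/* Element offset (from the start of the current block of C) of the
   microtile that follows microtile (i, j), assuming the loop over i
   (m_iter iterations) is nested inside the loop over j (n_iter
   iterations). rstep_c and cstep_c are the element strides between
   adjacent microtiles along i and j, respectively. */
static inline size_t next_c_offset( size_t i, size_t j,
                                    size_t m_iter, size_t n_iter,
                                    size_t rstep_c, size_t cstep_c )
{
    assert( i < m_iter && j < n_iter );

    /* Interior case: advance to the next microtile in the inner loop. */
    if ( i + 1 < m_iter ) return ( i + 1 ) * rstep_c + j * cstep_c;

    /* Inner-loop edge: move to the first microtile of the next column. */
    if ( j + 1 < n_iter ) return ( j + 1 ) * cstep_c;

    /* Last microtile overall: wrap around to the first one (assumed). */
    return 0;
}

/* Record the next-microtile address in the (hypothetical) aux struct. */
static inline void set_c_next( aux_sketch_t* aux, const double* c,
                               size_t i, size_t j,
                               size_t m_iter, size_t n_iter,
                               size_t rstep_c, size_t cstep_c )
{
    aux->c_next = c + next_c_offset( i, j, m_iter, n_iter,
                                     rstep_c, cstep_c );
}
```

A microkernel could then use the stored address as a prefetch hint while it finishes the current microtile. The two edge branches are the part the first caveat flags as unverified.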

(h/t to @devinamatthews and AMD for their contributions to this feature)

Note: I think we should wait until some of @devinamatthews's pending changes (which impact the non-gemm macrokernels) are merged before we extend this to the other level-3 operations. (I'm referring specifically to de-macroification.)

fgvanzee, May 11 '22 20:05

Note to self: Credit LeickR in the final squashed commit log.

fgvanzee, Jun 15 '22 21:06