Add support for `c_next` in the `auxinfo_t` struct.
This branch contains preliminary support for a new `.c_next` field within the `auxinfo_t` struct. It is fully implemented for `gemm`. Caveats:
- The "wrap-around" address computation for the edge cases is not yet verified (but should be close to correct).
- For now, only the `gemm` macrokernel (`bli_gemm_ker_var2()`) sets the `.c_next` field. The `gemmt`, `trmm`, and `trsm` macrokernels are (for now) oblivious.
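
For reviewers who want to sanity-check the wrap-around logic mentioned in the first caveat, here is a minimal standalone sketch of one plausible way to compute the address of the next microtile of C. It is not the code in this branch. The names `toy_auxinfo_t` and `toy_c_next`, and the parameters `m_iter`, `n_iter`, `rstep_c`, and `cstep_c`, are hypothetical. It also assumes the macrokernel walks microtiles with the `ir` (row) loop innermost and the `jr` (column) loop outermost, and that the last microtile wraps back to the first one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for auxinfo_t; only the field discussed here. */
typedef struct
{
    const double* c_next; /* address of the next microtile of C */
} toy_auxinfo_t;

/*
 * Return the address of the microtile of C that would be visited after
 * microtile (i, j), assuming i indexes the inner (ir) loop and j the
 * outer (jr) loop.
 *
 *   c        base address of the block of C handled by the macrokernel
 *   m_iter   number of microtiles along the m dimension
 *   n_iter   number of microtiles along the n dimension
 *   rstep_c  element offset between vertically adjacent microtiles
 *   cstep_c  element offset between horizontally adjacent microtiles
 */
static const double* toy_c_next( const double* c,
                                 size_t i, size_t j,
                                 size_t m_iter, size_t n_iter,
                                 size_t rstep_c, size_t cstep_c )
{
    assert( i < m_iter && j < n_iter );

    size_t i_next = i + 1;
    size_t j_next = j;

    /* Edge case 1: end of the ir loop, so move to the top of the next
       column of microtiles. */
    if ( i_next == m_iter ) { i_next = 0; j_next = j + 1; }

    /* Edge case 2: end of the jr loop as well, so wrap around to the
       first microtile (an assumption of this sketch). */
    if ( j_next == n_iter ) { j_next = 0; }

    return c + i_next * rstep_c + j_next * cstep_c;
}

/* Store the computed address in the (hypothetical) aux struct. */
static void toy_set_c_next( toy_auxinfo_t* aux, const double* c,
                            size_t i, size_t j,
                            size_t m_iter, size_t n_iter,
                            size_t rstep_c, size_t cstep_c )
{
    aux->c_next = toy_c_next( c, i, j, m_iter, n_iter, rstep_c, cstep_c );
}
```

In a 2x3 grid of microtiles, for example, the tile after (1, 0) is (0, 1), and the tile after (1, 2) wraps back to (0, 0). Whether the real macrokernel should wrap to the start or point somewhere else at the final iteration is exactly the part that still needs verification.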
(h/t to @devinamatthews and AMD for their contributions to this feature)
Note: I think we should wait until some of @devinamatthews's pending changes (which impact the non-gemm macrokernels) are merged before we extend this to the other level-3 operations. (I'm referring specifically to de-macroification.)
Note to self: Credit LeickR in the final squashed commit log.