blis
Fused gemvs and trmvs?
BLIS often refers to fused level-1 BLAS-like operations, but I have not seen any fused level-2 operations (e.g., computing y := A x and u := A^T v in a single sweep over A). Is there a plan to support such kernels?
We have no plans at this time. But I mean that literally.
You may or may not have already noticed this, but some of these compound ("fused") level-2 operations, such as the one you use as an example, can be implemented using the existing level-1f kernels. So the foundation is there; we just need to write the loops around those kernels and integrate the result into BLIS.
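To make the "loops around a level-1f kernel" idea concrete, here is a minimal sketch in plain C. It does not use the BLIS API: `fused_panel` and `fused_gemv_pair` are hypothetical names, and `fused_panel` is only a stand-in for the kind of work a level-1f kernel such as dotxaxpyf appears to do (a few dot products and a few axpys sharing one pass over a column panel). The sketch assumes real double-precision data, column-major storage, unit-stride vectors, no scaling factors, and an illustrative fuse factor of 4.

```c
#include <stddef.h>

/* Hypothetical stand-in for a level-1f style kernel (NOT the BLIS API).
   For an m x b column-major panel A with leading dimension lda:
     u[j] += sum_i A(i,j) * v[i]   (b dot products, i.e. A^T v)
     y[i] += sum_j A(i,j) * x[j]   (b axpys,        i.e. A x)
   Each element of the panel is loaded once and used for both updates. */
static void fused_panel(size_t m, size_t b, const double *A, size_t lda,
                        const double *v, const double *x,
                        double *y, double *u)
{
    for (size_t j = 0; j < b; ++j) {
        const double *a = A + j * lda;  /* column j of the panel */
        const double xj = x[j];
        double dot = 0.0;
        for (size_t i = 0; i < m; ++i) {
            const double aij = a[i];
            dot  += aij * v[i];         /* contributes to u[j] */
            y[i] += aij * xj;           /* contributes to y[i] */
        }
        u[j] += dot;
    }
}

/* Computes y := A x and u := A^T v in one sweep over A.
   A is m x n (not necessarily square), x and u have length n,
   v and y have length m. */
void fused_gemv_pair(size_t m, size_t n, const double *A, size_t lda,
                     const double *x, const double *v,
                     double *y, double *u)
{
    enum { FUSE = 4 };  /* assumed fuse factor; real kernels pick this per architecture */

    for (size_t i = 0; i < m; ++i) y[i] = 0.0;
    for (size_t j = 0; j < n; ++j) u[j] = 0.0;

    /* Outer loop: walk over column panels of width at most FUSE. */
    for (size_t j = 0; j < n; j += FUSE) {
        const size_t b = (n - j < FUSE) ? (n - j) : FUSE;
        fused_panel(m, b, A + j * lda, lda, v, x + j, y, u + j);
    }
}
```

The outer loop is the part that would have to be written and integrated; the inner routine is where an optimized kernel would be substituted. A real implementation would also need to handle scalars, conjugation, and general strides, which this sketch leaves out.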
I would find it very useful if there were an example of creating one or two of the fused kernels in BLIS (e.g., fused A^T x and A y for a potentially non-square A), as it would allow one to generalize the techniques without having to dig in and grok all of BLIS.