[WIP] forward GEMM workloads to GEMV when one argument is actually a vector

Open martin-frbg opened this issue 1 year ago • 4 comments

fixes #4580 and fixes #528

martin-frbg avatar May 20 '24 20:05 martin-frbg

obviously I don't really intend to kick out the recent Loongson patch here - this first draft was thrown together off-grid in an outdated fork

martin-frbg avatar May 20 '24 20:05 martin-frbg

CodSpeed Performance Report

Merging #4708 will not alter performance

Comparing martin-frbg:issue4580 (c2a9b19) with develop (700ea74)

Summary

✅ 16 untouched benchmarks

codspeed-hq[bot] avatar May 20 '24 21:05 codspeed-hq[bot]

@martin-frbg, thank you for the PR. Just to be sure about how data is sent to GEMV: for example, when A is a 1xn matrix and B is nxk, are we flattening A (i.e. converting the matrix to a vector) to make it compatible with GEMV?

akote123 avatar Jun 12 '24 07:06 akote123

Yes in principle, but I am not convinced we actually have to transform the storage of A for that. (Note that the rough draft I posted here may not even compile. I need to update it and flesh it out when I have time)

martin-frbg avatar Jun 12 '24 08:06 martin-frbg
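
To illustrate why the storage of A may not need to change, here is a minimal sketch, assuming the column-major layout of the reference BLAS interface. A 1xn matrix with leading dimension `lda` is already a vector of n elements with stride `lda`, so A*B can be computed as a transposed GEMV on B with `incx = lda` and `incy = ldc`. The helpers `ref_gemm` and `ref_gemv_t` are hypothetical plain-C stand-ins written for this example; they are not OpenBLAS kernels and not the code in this PR.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference GEMM (column-major): C = A * B,
 * where A is m x p, B is p x q and C is m x q. */
static void ref_gemm(int m, int q, int p,
                     const double *a, int lda,
                     const double *b, int ldb,
                     double *c, int ldc)
{
    assert(lda >= m && ldb >= p && ldc >= m);
    for (int j = 0; j < q; j++) {
        for (int i = 0; i < m; i++) {
            double s = 0.0;
            for (int l = 0; l < p; l++)
                s += a[i + (size_t)l * lda] * b[l + (size_t)j * ldb];
            c[i + (size_t)j * ldc] = s;
        }
    }
}

/* Hypothetical reference transposed GEMV (column-major): y = A^T * x,
 * where A is rows x cols and x, y are strided vectors. */
static void ref_gemv_t(int rows, int cols,
                       const double *a, int lda,
                       const double *x, int incx,
                       double *y, int incy)
{
    assert(lda >= rows && incx > 0 && incy > 0);
    for (int j = 0; j < cols; j++) {
        double s = 0.0;
        for (int i = 0; i < rows; i++)
            s += a[i + (size_t)j * lda] * x[(size_t)i * incx];
        y[(size_t)j * incy] = s;
    }
}

/* Returns 1 if GEMM with a 1 x N matrix A agrees with GEMV on B^T. */
int row_vector_gemm_matches_gemv(void)
{
    enum { N = 3, K = 2, LDA = 4, LDB = 3, LDC = 2 };
    double a[LDA * N], b[LDB * K], c_gemm[LDC * K], c_gemv[LDC * K];

    for (int i = 0; i < LDA * N; i++) a[i] = 1.0 + 0.5 * i;
    for (int i = 0; i < LDB * K; i++) b[i] = 2.0 - 0.25 * i;
    for (int i = 0; i < LDC * K; i++) c_gemm[i] = c_gemv[i] = 0.0;

    /* GEMM view: A is 1 x N with leading dimension LDA (padded on purpose). */
    ref_gemm(1, K, N, a, LDA, b, LDB, c_gemm, LDC);

    /* GEMV view: the same memory for A is a vector with stride LDA,
     * and the single row of C is a vector with stride LDC. No copy is made. */
    ref_gemv_t(N, K, b, LDB, a, LDA, c_gemv, LDC);

    for (int j = 0; j < K; j++) {
        double d = c_gemm[j * LDC] - c_gemv[j * LDC];
        if (d < 0.0) d = -d;
        if (d > 1e-12) return 0;
    }
    return 1;
}
```

The example deliberately uses `LDA = 4` for a matrix with a single row, to show that padding in the leading dimension maps onto the vector increment rather than requiring a repack.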

Thanks... I have an equally unfinished newer version lying around somewhere but got caught up in other things. Let me get the fixups for the SCAL fallout out of the way... but if anybody beats me to it on this here topic it's fine of course. (probably need to remove this from the 0.3.28 milestone anyway so that the release does not get delayed all summer)

martin-frbg avatar Jul 11 '24 18:07 martin-frbg

@martin-frbg This is an important PR, since some projects have a large share of GEMM calls in which N=1. In these cases we spend significant time packing buffers, which would not be necessary if GEMV were called instead.

ChipKerchner avatar Jul 11 '24 20:07 ChipKerchner
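
For the N=1 case mentioned above, a minimal sketch of the equivalence, again assuming column-major storage and a non-transposed B: an m x k matrix times a k x 1 matrix is the same computation as a non-transposed GEMV with unit increments. `ref_gemm` and `ref_gemv_n` are hypothetical reference loops written for this example, not OpenBLAS routines, and the sketch says nothing about actual packing costs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference GEMM (column-major): C = A * B,
 * where A is m x p, B is p x q and C is m x q. */
static void ref_gemm(int m, int q, int p,
                     const double *a, int lda,
                     const double *b, int ldb,
                     double *c, int ldc)
{
    assert(lda >= m && ldb >= p && ldc >= m);
    for (int j = 0; j < q; j++) {
        for (int i = 0; i < m; i++) {
            double s = 0.0;
            for (int l = 0; l < p; l++)
                s += a[i + (size_t)l * lda] * b[l + (size_t)j * ldb];
            c[i + (size_t)j * ldc] = s;
        }
    }
}

/* Hypothetical reference non-transposed GEMV (column-major): y = A * x. */
static void ref_gemv_n(int rows, int cols,
                       const double *a, int lda,
                       const double *x, int incx,
                       double *y, int incy)
{
    assert(lda >= rows && incx > 0 && incy > 0);
    for (int i = 0; i < rows; i++)
        y[(size_t)i * incy] = 0.0;
    for (int j = 0; j < cols; j++) {
        double xj = x[(size_t)j * incx];
        for (int i = 0; i < rows; i++)
            y[(size_t)i * incy] += a[i + (size_t)j * lda] * xj;
    }
}

/* Returns 1 if GEMM with a single-column B agrees with GEMV. */
int column_vector_gemm_matches_gemv(void)
{
    enum { M = 3, K = 2 };
    double a[M * K], b[K], c_gemm[M], c_gemv[M];

    for (int i = 0; i < M * K; i++) a[i] = 1.0 + 0.5 * i;
    for (int i = 0; i < K; i++) b[i] = 2.0 - 0.25 * i;

    /* GEMM view: B is K x 1 and C is M x 1. */
    ref_gemm(M, 1, K, a, M, b, K, c_gemm, M);

    /* GEMV view: B and C are contiguous vectors (incx = incy = 1). */
    ref_gemv_n(M, K, a, M, b, 1, c_gemv, 1);

    for (int i = 0; i < M; i++) {
        double d = c_gemm[i] - c_gemv[i];
        if (d < 0.0) d = -d;
        if (d > 1e-12) return 0;
    }
    return 1;
}
```

If B were passed transposed, it would be stored as a 1 x k matrix and the vector increment would become `ldb` instead of 1.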

I am aware of that, but this has been an important issue for roughly 20 years (i.e. since inception of GotoBLAS), last discussed here sometime in 2015/16 IIRC. We're almost 2 weeks past the tentative release date for 0.3.28, it bundles an excessive number of changes already, and I still need to come up with assembly code fixes for the SCAL issue in a number of architectures (where assembly isn't my strongest skill anyway).

martin-frbg avatar Jul 11 '24 20:07 martin-frbg

@martin-frbg I took a look into this in #4814

Mousius avatar Jul 23 '24 22:07 Mousius

closing as superseded by #4814

martin-frbg avatar Aug 03 '24 13:08 martin-frbg