
Add missing loop over block columns of C

Open angsch opened this issue 2 years ago • 0 comments

The block column width of C is defined by #define nb 1000 and is used to allocate the buffers, but nb is otherwise unused. As a result, if the m-by-n matrix C has n > nb, the code overruns those buffers and hits a segmentation fault. This commit fixes the issue by

  • adding the missing loop over the block columns of C, and
  • setting the default leading dimension of each matrix to its row count.
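
The two changes above can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the names gemm_block_columns and panel_multiply, the small NB value, and the malloc'd workspace are assumptions made for the example. It assumes column-major storage and shows a workspace sized for NB columns being reused for one block column of C at a time, so that n may exceed NB.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define NB 4 /* block column width; kept small here (the issue uses nb = 1000) */

/* Hypothetical stand-in for the inner kernel: C(:, 0:jb) += A * B(:, 0:jb).
 * The B panel is first copied into work, which holds at most k * NB doubles. */
static void panel_multiply(int m, int jb, int k,
                           const double *a, int lda,
                           const double *b, int ldb,
                           double *c, int ldc, double *work)
{
    for (int j = 0; j < jb; j++)
        for (int p = 0; p < k; p++)
            work[(size_t)j * k + p] = b[(size_t)j * ldb + p];

    for (int j = 0; j < jb; j++)
        for (int p = 0; p < k; p++)
            for (int i = 0; i < m; i++)
                c[(size_t)j * ldc + i] +=
                    a[(size_t)p * lda + i] * work[(size_t)j * k + p];
}

/* C += A * B for column-major A (m x k), B (k x n), C (m x n).
 * Returns 0 on success, -1 if the workspace cannot be allocated. */
int gemm_block_columns(int m, int n, int k,
                       const double *a, int lda,
                       const double *b, int ldb,
                       double *c, int ldc)
{
    assert(lda >= m && ldb >= k && ldc >= m);

    double *work = malloc((size_t)k * NB * sizeof *work);
    if (work == NULL)
        return -1;

    /* The loop over block columns: without it, a workspace sized for
     * NB columns would be indexed with all n columns at once. */
    for (int j = 0; j < n; j += NB) {
        int jb = (n - j < NB) ? (n - j) : NB; /* last block may be narrower */
        panel_multiply(m, jb, k, a, lda,
                       &b[(size_t)j * ldb], ldb,
                       &c[(size_t)j * ldc], ldc, work);
    }

    free(work);
    return 0;
}
```

A caller with tightly packed matrices would pass the row counts as leading dimensions, for example gemm_block_columns(m, n, k, a, m, b, k, c, m), which mirrors the second bullet.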

angsch, Jul 29 '23 07:07