how-to-optimize-gemm

How To Optimize Gemm wiki pages

https://github.com/flame/how-to-optimize-gemm/wiki

Copyright by Prof. Robert van de Geijn ([email protected]).

Adapted to GitHub Markdown Wiki by Jianyu Huang ([email protected]).

Table of contents

  • The GotoBLAS/BLIS Approach to Optimizing Matrix-Matrix Multiplication - Step-by-Step
  • NOTICE ON ACADEMIC HONESTY
  • References
  • Set Up
  • Step-by-step optimizations
  • Computing four elements of C at a time
    • Hiding computation in a subroutine
    • Computing four elements at a time
    • Further optimizing
  • Computing a 4 x 4 block of C at a time
    • Repeating the same optimizations
    • Further optimizing
    • Blocking to maintain performance
    • Packing into contiguous memory
  • Acknowledgement

Related Links

Acknowledgement

This material was partially sponsored by grants from the National Science Foundation (Awards ACI-1148125/1340293 and ACI-1550493).

Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation (NSF).