Add: fsa/flash-llm.md
Thank you very much
Hello! To give some detail on this blog: it is work by @Summer-Summer at FSA-Lab, together with collaborators at Alibaba Research. The source code can be found at https://github.com/AlibabaResearch/flash-llm and https://github.com/usyd-fsalab/flash-llm. Flash-LLM is a large-scale LLM inference library focusing on GPU code optimisations for sparse matrices.
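For readers who want a rough feel for what "GPU code for sparse matrices" means, here is a minimal CSR sparse-matrix × vector CUDA sketch. It is purely illustrative and is not Flash-LLM's actual kernel or API (Flash-LLM's optimisations go much further, e.g. towards dense tensor-core compute on sparsely stored weights); all names below are hypothetical.

```cuda
// Minimal CSR SpMV sketch: y = A * x, one thread per row.
// Illustrative only — not Flash-LLM's kernel; all names are hypothetical.
#include <cstdio>
#include <cuda_runtime.h>

__global__ void csr_spmv(int rows, const int *row_ptr, const int *col_idx,
                         const float *vals, const float *x, float *y) {
    int row = blockIdx.x * blockDim.x + threadIdx.x;
    if (row < rows) {
        float acc = 0.0f;
        // Accumulate only the nonzeros stored for this row.
        for (int i = row_ptr[row]; i < row_ptr[row + 1]; ++i)
            acc += vals[i] * x[col_idx[i]];
        y[row] = acc;
    }
}

int main() {
    // 2x3 matrix [[1,0,2],[0,3,0]] in CSR form, multiplied by x = [1,1,1].
    int   h_row_ptr[] = {0, 2, 3};
    int   h_col_idx[] = {0, 2, 1};
    float h_vals[]    = {1.f, 2.f, 3.f};
    float h_x[]       = {1.f, 1.f, 1.f};
    float h_y[2];

    int *d_row_ptr, *d_col_idx;
    float *d_vals, *d_x, *d_y;
    cudaMalloc(&d_row_ptr, sizeof(h_row_ptr));
    cudaMalloc(&d_col_idx, sizeof(h_col_idx));
    cudaMalloc(&d_vals, sizeof(h_vals));
    cudaMalloc(&d_x, sizeof(h_x));
    cudaMalloc(&d_y, sizeof(h_y));
    cudaMemcpy(d_row_ptr, h_row_ptr, sizeof(h_row_ptr), cudaMemcpyHostToDevice);
    cudaMemcpy(d_col_idx, h_col_idx, sizeof(h_col_idx), cudaMemcpyHostToDevice);
    cudaMemcpy(d_vals, h_vals, sizeof(h_vals), cudaMemcpyHostToDevice);
    cudaMemcpy(d_x, h_x, sizeof(h_x), cudaMemcpyHostToDevice);

    csr_spmv<<<1, 32>>>(2, d_row_ptr, d_col_idx, d_vals, d_x, d_y);
    cudaMemcpy(h_y, d_y, sizeof(h_y), cudaMemcpyDeviceToHost);
    printf("y = [%.1f, %.1f]\n", h_y[0], h_y[1]);  // expect [3.0, 3.0]

    cudaFree(d_row_ptr); cudaFree(d_col_idx);
    cudaFree(d_vals); cudaFree(d_x); cudaFree(d_y);
    return 0;
}
```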
@osanseviero @sayakpaul Let us know if there is anything you need.