luminal
Deep learning at the speed of light.
We should have a GitHub Action that benchmarks performance and compares the results against PyTorch numbers to catch any regressions.
Currently the elementwise fusion is very conservative in what it fuses. It can be a lot more aggressive by:
- Fusing constants into kernels
- Fusing across shape changes and...
Hi, great project! I'd like to add a new backend/compiler. Is there a step-by-step guide for this to make sure I don't forget anything?
https://arxiv.org/pdf/2001.03288.pdf
Hi @jafoti, nice project! I did something somewhat similar a few years ago in Scala. I skimmed a little bit through the project, so I only have a superficial understanding....
Good morning (or afternoon/evening)! Among the techniques for speeding up LLM inference, there is a method called **self speculative decoding**. Would it be possible to implement this feature...