streamingllm topic
streamingllm repositories
intel-extension-for-transformers
Stars: 2.2k | Forks: 216 | Watchers: 2.2k
⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Platforms⚡
Awesome-LLM-Inference
Stars: 4.9k | Forks: 330 | Watchers: 4.9k
📚A curated list of Awesome LLM/VLM Inference Papers with Codes: Flash-Attention, Paged-Attention, WINT8/4, Parallelism, etc.🎉