streamingllm topic

Repositories tagged with the streamingllm topic

intel-extension-for-transformers

2.2k Stars · 216 Forks · 2.2k Watchers

⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Platforms⚡

Awesome-LLM-Inference

4.9k Stars · 330 Forks · 4.9k Watchers

📚A curated list of Awesome LLM/VLM Inference Papers with Codes: Flash-Attention, Paged-Attention, WINT8/4, Parallelism, etc.🎉