efficient-llm-inference topic

Repositories tagged with efficient-llm-inference:

Consistency_LLM
348 stars · 17 forks

[ICML 2024] CLLMs: Consistency Large Language Models

Context-Memory
62 stars · 3 forks · 62 watchers

PyTorch implementation of "Compressed Context Memory for Online Language Model Interaction" (ICLR 2024)