efficient-llm-inference topic

List efficient-llm-inference repositories

Consistency_LLM
348 Stars · 17 Forks
[ICML 2024] CLLMs: Consistency Large Language Models

Context-Memory
63 Stars · 2 Forks · 63 Watchers
PyTorch implementation of "Compressed Context Memory for Online Language Model Interaction" (ICLR'24)