efficient-llm-inference topic

Repositories tagged with the efficient-llm-inference topic:

Consistency_LLM (348 stars, 17 forks)

[ICML 2024] CLLMs: Consistency Large Language Models

Context-Memory (49 stars, 1 fork)

PyTorch implementation for "Compressed Context Memory for Online Language Model Interaction" (ICLR'24)