Inference Systems for Foundation Models
Foundation Model Inference
[ICML'23] FlexGen: High-Throughput Generative Inference of Large Language Models with a Single GPU (FMInference). Running large language models on a single GPU for throughput-oriented scenarios.
[NeurIPS'23] H2O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models.
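H2O's core observation is that a small set of "heavy-hitter" tokens receives most of the attention mass, so the KV cache can be shrunk by keeping only those tokens plus the most recent ones. The sketch below illustrates that eviction policy under stated assumptions; the function name, parameters, and scoring are illustrative, not H2O's actual implementation or API.

```python
import numpy as np

def h2o_keep(attn_scores, cache_len, budget, recent):
    """Choose KV-cache positions to retain, in the spirit of H2O.

    Illustrative sketch, not the paper's code:
      attn_scores: (num_queries, cache_len) attention weights observed so far
      budget:      total number of cached positions to keep
      recent:      newest positions that are always retained
    """
    # Accumulate the attention mass each cached token has received.
    acc = attn_scores.sum(axis=0)  # shape: (cache_len,)
    # Always keep the most recent `recent` tokens.
    keep = set(range(cache_len - recent, cache_len))
    # Fill the remaining budget with heavy hitters: the older positions
    # with the highest accumulated attention.
    older = [i for i in np.argsort(-acc) if i not in keep]
    keep.update(older[: budget - recent])
    return sorted(keep)

# Toy example: 1 query over 8 cached tokens; keep 4 (2 recent + 2 heavy).
scores = np.array([[0.30, 0.05, 0.25, 0.05, 0.15, 0.05, 0.10, 0.05]])
print(h2o_keep(scores, cache_len=8, budget=4, recent=2))  # → [0, 2, 6, 7]
```

Positions 0 and 2 survive because they carry the largest accumulated attention; 6 and 7 survive because they are the newest. Everything else would be evicted, shrinking the cache to half its size in this toy case.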