fastertransformer topic

Repositories tagged with the fastertransformer topic

serving-codegen-gptj-triton

20 Stars, 0 Forks

Serving Example of CodeGen-350M-Mono-GPTJ on Triton Inference Server with Docker and Kubernetes

lmdeploy

2.7k Stars, 243 Forks

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.