sglang
Deploy multi-node inference with sglang in a K8s cluster (LWS method)
Motivation
- The community lacks a good example of distributed inference in a K8s environment.
- The community lacks examples that combine containerized environments with high-speed networks such as RoCE.
- K8s is the most popular open-source infrastructure platform, yet there are no established best practices for integrating it with sglang, one of the most popular recent open-source inference projects.