Deploy multi-node inference (LWS method) using sglang in a K8s cluster

Open whybeyoung opened this issue 10 months ago • 0 comments

Motivation

  1. The community lacks a good example of distributed inference in a K8s environment.
  2. The community lacks examples of running inference in containerized environments over high-speed networks such as RoCE.
  3. K8s is the most popular open-source infrastructure platform, but there are no established best practices for integrating it with sglang, one of the most popular recent open-source inference projects.

whybeyoung, Feb 17 '25 06:02