Jiaxin Shan

Results 742 comments of Jiaxin Shan

In the documentation, I suggest using `create` instead of `apply`. `create` will work around the issue.

It seems the key problem is that sglang and vllm do not always align on a compatible torch version.
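One way to look at such a mismatch is to compare the torch requirement that each installed package declares in its metadata. The following is a minimal sketch, assuming both packages are installed in the same environment and list torch as a dependency. The helper names (`torch_specifiers`, `installed_torch_specifiers`, `pins_aligned`) are hypothetical and not part of either project.

```python
import re
from importlib import metadata
from typing import Iterable, Optional, Set

# Match a requirement on "torch" itself, not "torchvision" or "torch-foo".
# Accepts both "torch==X" and the older "torch (==X)" metadata style.
_TORCH_REQ = re.compile(r"^torch(?![\w.\-])\s*\(?\s*([^;)]*)")


def torch_specifiers(requirements: Optional[Iterable[str]]) -> Set[str]:
    """Collect the version specifiers a requirement list places on torch.

    An unconstrained "torch" requirement is recorded as an empty string.
    """
    found = set()
    for req in requirements or []:
        match = _TORCH_REQ.match(req.strip())
        if match:
            found.add(match.group(1).replace(" ", ""))
    return found


def installed_torch_specifiers(dist_name: str) -> Set[str]:
    """Read torch specifiers from an installed distribution, if present."""
    try:
        return torch_specifiers(metadata.requires(dist_name))
    except metadata.PackageNotFoundError:
        return set()


def pins_aligned(a: Set[str], b: Set[str]) -> bool:
    """True only when both sides declare exactly the same torch specifiers."""
    return bool(a) and a == b


if __name__ == "__main__":
    vllm_pins = installed_torch_specifiers("vllm")
    sglang_pins = installed_torch_specifiers("sglang")
    print("vllm:", vllm_pins, "sglang:", sglang_pins)
    print("aligned:", pins_aligned(vllm_pins, sglang_pins))
```

String equality is a deliberately strict check: two different specifiers can still admit a common torch version, so a `False` result only flags the pair for manual review.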

@ModiCodeCraftsman I’ve reviewed the doc and overall it looks good. Just a few suggestions to ensure full compatibility:

- Tenant ID should be optional. The key builder and related logic...

It should be done in the cold start manager or some other reusable component.

## Routing

![Image](https://github.com/user-attachments/assets/c4ff2a79-8e5f-4524-8ca1-4f7a141056ba)
![Image](https://github.com/user-attachments/assets/f45c64c2-bf3e-448f-b170-238bd953bf24)

Always hits the head.

---

Update: after running more tests, I noticed this is not true. I did see it reach other pods, but due...

## RayCluster Orchestration related

1. ray.io/overwrite-container-cmd -> RayCluster level
2. Head & worker annotations have to be set separately; there's no propagation to different roles yet. RayClusterFleet spec.templates.metadata controls RayCluster...

### vLLM 0.7.3 problem

![Image](https://github.com/user-attachments/assets/fad97710-e2ff-45de-8cb8-7bde93d0fc85)

It hangs for a long time. I checked https://github.com/vllm-project/vllm/issues/13136 and decided to rebuild the image:

```
FROM vllm/vllm-openai:v0.7.3
RUN pip3 install -U ray[default,adag]==2.40.0 --progress-bar off
# important...
```

## RDMA setup

From the NCCL logs, we can see that cross-node communication is happening over RDMA, while intra-node transfers fall back to IPC (NVLink in this case). ('NCCL INFO...
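To make this kind of check repeatable, the transport on each channel line can be tallied from the log. This is a minimal sketch, assuming `NCCL_DEBUG=INFO` output in which channel lines end with `via <transport>` (for example `via NET/IB/...` for RDMA verbs and `via P2P/IPC` for intra-node peer-to-peer). The function names and the category labels are hypothetical, and the exact log text can vary between NCCL versions.

```python
import re
from collections import Counter
from typing import Iterable, Optional

# Capture the transport token that follows "via" on an NCCL INFO line.
_VIA = re.compile(r"NCCL INFO\b.*\bvia\s+(\S+)")


def classify_transport(line: str) -> Optional[str]:
    """Map one log line to a coarse transport label, or None if not a channel line."""
    match = _VIA.search(line)
    if not match:
        return None
    transport = match.group(1)
    if transport.startswith("NET/IB"):
        return "rdma"      # InfiniBand / RoCE verbs transport
    if transport.startswith("NET/Socket"):
        return "socket"    # TCP fallback, no RDMA
    if transport.startswith("P2P"):
        return "p2p"       # intra-node GPU peer-to-peer (e.g. P2P/IPC)
    if transport.startswith("SHM"):
        return "shm"       # intra-node shared memory
    return "other"


def summarize(lines: Iterable[str]) -> Counter:
    """Count channel lines per transport label."""
    counts = Counter()
    for line in lines:
        label = classify_transport(line)
        if label is not None:
            counts[label] += 1
    return counts


if __name__ == "__main__":
    import sys
    # Usage: python nccl_transport.py < nccl.log
    for label, count in summarize(sys.stdin).most_common():
        print(f"{label}: {count}")
```

A non-zero `socket` count on cross-node channels would suggest that RDMA is not actually in use for those connections.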

## Autoscaling

![Image](https://github.com/user-attachments/assets/214e7978-32f5-4054-85b8-1cb9e47aa5c1)

```
NAME                                                          READY   STATUS    RESTARTS   AGE   IP             NODE           NOMINATED NODE   READINESS GATES
deepseek-r1-671b-56f9654bbb-mgdwd-head-lf5xg                  1/1     Running   0          27m   192.168.0.74   192.168.0.51
deepseek-r1-671b-56f9654bbb-mgdwd-worker-group-worker-pb4hh   1/1     Running   0          27m   192.168.0.81   192.168.0.52
```

...