[Example] Add vllm semantic router example

Open • ztang2370 opened this issue 2 months ago • 6 comments

Issue: https://github.com/ovg-project/kvcached/issues/91

This is a WIP: a basic working example.

TODO:

  1. Add sleep and wakeup functionality driven by the traffic monitoring status.
  2. vllm semantic router has no release yet, so the patches will need to be updated once one is available.
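
For TODO item 1, here is a rough sketch of what traffic-driven sleep and wakeup decisions could look like. It is not the kvcached or vllm semantic router implementation: `IdleSleepManager`, `record_request`, `models_to_sleep`, and the 300-second timeout are all hypothetical names and values chosen for illustration. It only tracks per-model request times and reports which models have been idle long enough to sleep; the actual sleep and wake calls are left to the caller.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class IdleSleepManager:
    """Hypothetical helper: tracks per-model traffic and decides sleep/wake."""

    idle_timeout_s: float = 300.0  # assumed threshold, not a kvcached default
    clock: Callable[[], float] = time.monotonic  # injectable for testing
    _last_request: Dict[str, float] = field(default_factory=dict)
    _asleep: Dict[str, bool] = field(default_factory=dict)

    def record_request(self, model: str) -> bool:
        """Note traffic for `model`; return True if it must be woken first."""
        needs_wake = self._asleep.get(model, False)
        self._asleep[model] = False
        self._last_request[model] = self.clock()
        return needs_wake

    def models_to_sleep(self) -> List[str]:
        """Return awake models idle for at least the timeout, marking them asleep."""
        now = self.clock()
        idle = [
            name
            for name, last_seen in self._last_request.items()
            if not self._asleep.get(name, False)
            and now - last_seen >= self.idle_timeout_s
        ]
        for name in idle:
            self._asleep[name] = True
        return idle
```

A router would call `record_request` on each incoming request (waking the model when it returns `True`) and poll `models_to_sleep` periodically to put idle models to sleep.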

ztang2370, Sep 28 '25 17:09

Great! I think the sleep and wakeup functionality has already been merged, right? @jiarong0907

cui36, Sep 28 '25 22:09

I think traffic monitoring and sleep management are tied to the router. Since we are using the vllm semantic router here instead of the controller/router, sleep management is unsupported. To enable it, we might need to integrate traffic monitoring and sleep management into the semantic router.

ztang2370, Sep 29 '25 11:09

@ztang2370 @cui36 Having vllm semantic router is great, but I would suggest we add it as a feature later.

For the example, we can just use our current router. The example only needs to show the router and sleep features. This will be the 03_model_router_sleep example.

jiarong0907, Sep 30 '25 01:09

But my understanding is that the routing and sleeping features are already shown in the controller, aren't they?

ztang2370, Sep 30 '25 06:09

Yes, but we need an end-to-end example to show users how this works and how it can be used.

jiarong0907, Sep 30 '25 07:09

Oh I see. Will update some info there today.

cui36, Sep 30 '25 15:09