[Example] Add vllm semantic router example
Issue: https://github.com/ovg-project/kvcached/issues/91
This is WIP, a basic working example.
TODO:
- Include the sleep and wakeup functionality based on the traffic monitoring status.
- The vllm semantic router has no release yet, so the patches will need to be updated once one is available.
Great! I think the sleep and wakeup functionality has already been merged, right? @jiarong0907
I think traffic monitoring and sleep management are tied to the router. Since we are using the vllm semantic router here instead of the controller/router, sleep management is unsupported. To enable it, we might need to integrate traffic monitoring and sleep management into the semantic router.
@ztang2370 @cui36 Having vllm semantic router is great, but I would suggest we add it as a feature later.
For the example, we can just use the current router we have. The example just needs to show the features of the router and sleeping. This will be the 03_model_router_sleep.
But my understanding is that the routing and sleeping features are already shown in the controller, aren't they?
Yes, but we need an end-to-end example to show users how this works and how it can be used.
Oh I see. Will update some info there today.