kvcached
[TODO] Add an example with the vLLM Semantic Router?
Can we add an example that demonstrates kvcached with the vLLM Semantic Router? https://vllm-semantic-router.com/
We could run multiple models on one GPU for the router to choose among, and put models to sleep or wake them up based on traffic monitoring.
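To make the request more concrete, here is a rough sketch of the traffic-based sleep/wake-up decision such an example might include. Everything in it is hypothetical: `IdleSleepPolicy`, `ModelState`, and the `("sleep", name)` / `("wake", name)` action tuples are illustrative names, not kvcached or vLLM Semantic Router APIs. The sketch only decides what to do; the actual example would map each action to whatever sleep and wake-up mechanism the serving stack exposes.

```python
from dataclasses import dataclass, field


@dataclass
class ModelState:
    """Traffic bookkeeping for one model sharing the GPU (hypothetical)."""
    name: str
    asleep: bool = False
    last_request_ts: float = 0.0


@dataclass
class IdleSleepPolicy:
    """Sleep models idle for at least `idle_timeout_s`; wake them on demand."""
    idle_timeout_s: float = 60.0
    models: dict[str, ModelState] = field(default_factory=dict)

    def register(self, name: str, now: float) -> None:
        # A newly registered model starts awake with a fresh idle timer.
        self.models[name] = ModelState(name=name, last_request_ts=now)

    def on_request(self, name: str, now: float) -> list[tuple[str, str]]:
        """Record a routed request; emit a wake action if the model was asleep."""
        state = self.models[name]
        state.last_request_ts = now
        if state.asleep:
            state.asleep = False
            return [("wake", name)]
        return []

    def tick(self, now: float) -> list[tuple[str, str]]:
        """Periodic check; emit sleep actions for models that have gone idle."""
        actions = []
        for state in self.models.values():
            idle_for = now - state.last_request_ts
            if not state.asleep and idle_for >= self.idle_timeout_s:
                state.asleep = True
                actions.append(("sleep", state.name))
        return actions
```

The router side would call `on_request` with the model it picked for each request, and a background loop would call `tick` every few seconds. A fixed idle timeout is just the simplest policy to illustrate the idea; a rate-based or memory-pressure-based trigger would slot into `tick` the same way.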