[Dist KV] vLLM pods crash when no kvcache pod is running on the same node
🚀 Feature Description and Motivation
vLLM pods crash when no kvcache pod is running on the same node.
Every vLLM pod should run with a kvcache pod on the same node.
A temporary solution would be to spread kvcache pods across all nodes using affinity and anti-affinity, but that is not very reliable. A more elegant and reliable solution is needed.
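For reference, the affinity/anti-affinity workaround could look roughly like the sketch below. It only builds the `affinity` stanzas as plain dicts and prints them as JSON. The label `app: kvcache` and the helper names are assumptions for illustration, not the labels this project actually uses; the field names (`podAffinity`, `podAntiAffinity`, `requiredDuringSchedulingIgnoredDuringExecution`, `topologyKey`) are the standard Kubernetes ones.

```python
import json

# Well-known node label; using it as topologyKey means "same node".
HOSTNAME_KEY = "kubernetes.io/hostname"


def _same_node_term(label_key, label_value):
    """Build one pod affinity term that matches pods by label on the same node."""
    return {
        "labelSelector": {"matchLabels": {label_key: label_value}},
        "topologyKey": HOSTNAME_KEY,
    }


def cache_pod_affinity(label_key="app", label_value="kvcache"):
    """Anti-affinity for cache pods: never place two of them on one node.

    The label key/value are hypothetical placeholders.
    """
    term = _same_node_term(label_key, label_value)
    return {
        "podAntiAffinity": {
            "requiredDuringSchedulingIgnoredDuringExecution": [term]
        }
    }


def engine_pod_affinity(label_key="app", label_value="kvcache"):
    """Affinity for engine (vLLM) pods: only schedule next to a cache pod."""
    term = _same_node_term(label_key, label_value)
    return {
        "podAffinity": {
            "requiredDuringSchedulingIgnoredDuringExecution": [term]
        }
    }


if __name__ == "__main__":
    print(json.dumps({"cache": cache_pod_affinity(),
                      "engine": engine_pod_affinity()}, indent=2))
```

Note that anti-affinity only prevents two cache pods from sharing a node; it does not guarantee that every node gets one, which is part of why this workaround is unreliable.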
Use Case
Distributed KV cache setup.
Proposed Solution
No response
If a node running engine pods doesn't have a cache pod, the engine pods will crash. Affinity is one problem.
There's another issue I am a little concerned about: right now each cache pod mounts a KV-specific path instead of a kv-instance level path. That means a node can only have one cache pod scheduled on it, which could be a problem as well.
Let's gradually improve this to a more reliable state.
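To make the mount-path concern concrete, here is a minimal sketch of the difference between one shared path and a per-instance path. The base directory, instance names, and function names are all hypothetical; the real mount layout may differ.

```python
import posixpath
from collections import defaultdict


def shared_mount_path(base_dir):
    """Current behaviour as described: every cache pod uses the same path."""
    return base_dir


def instance_mount_path(base_dir, instance_name):
    """Per-instance path: one subdirectory per kv cache instance (assumed layout)."""
    if not instance_name or "/" in instance_name or instance_name in (".", ".."):
        raise ValueError(f"invalid instance name: {instance_name!r}")
    return posixpath.join(base_dir, instance_name)


def find_conflicts(pod_paths):
    """Return {path: [pods]} for every path claimed by more than one pod on a node."""
    by_path = defaultdict(list)
    for pod, path in pod_paths.items():
        by_path[path].append(pod)
    return {path: sorted(pods) for path, pods in by_path.items() if len(pods) > 1}


if __name__ == "__main__":
    base = "/var/lib/kvcache"  # hypothetical base directory
    shared = {name: shared_mount_path(base) for name in ("cache-a", "cache-b")}
    scoped = {name: instance_mount_path(base, name) for name in ("cache-a", "cache-b")}
    print("shared path conflicts:", find_conflicts(shared))
    print("per-instance conflicts:", find_conflicts(scoped))
```

With the shared path, two cache pods on one node collide on the same directory; with a per-instance path they do not, so more than one cache pod could be scheduled per node.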