ray_vllm_inference
How can we do the same using KubeRay, RayCluster, and Ray Serve?
I have a Kubernetes cluster with the KubeRay operator installed and a RayCluster created. How do I then create a manifest for Ray Serve to serve a model such as Llama 2 or Mistral 7B? I appreciate your help, and thank you in advance!