vllm
[Doc]: How to run vLLM with a remote Ray cluster?
📚 The doc issue
I want to know how to run vLLM against a remote Ray cluster.
My code is:

```python
from llama_index.llms.vllm import Vllm
import ray

ray.init(address="ray://10.0.233.89:10001")
llm = Vllm(
    model="./Mistral-7B-Instruct-v0.1",
    dtype="float16",
    tensor_parallel_size=2,
    temperature=0,
    max_new_tokens=100,
    vllm_kwargs={
        "swap_space": 1,
        "gpu_memory_utilization": 0.5,
        "max_model_len": 4096,
    },
)
```
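For context, a sketch of the pattern that usually works for multi-GPU vLLM on an existing Ray cluster: run the driver script on a node that is already part of the cluster (joined via `ray start --head` on the head node and `ray start --address=<head-ip>:6379` on workers), and attach with `ray.init(address="auto")` rather than a Ray Client `ray://` address. This is a hedged sketch, not a confirmed fix; the model path and parameters are taken from the snippet above, and whether vLLM supports the Ray Client protocol here is exactly the question being asked.

```python
import ray
from vllm import LLM

# Attach to the already-running cluster from a node inside it.
# "auto" resolves the local Ray instance started with `ray start`,
# instead of connecting remotely over the Ray Client (ray://) port.
ray.init(address="auto")

# With tensor_parallel_size > 1, vLLM shards the model across GPUs
# using Ray workers from the cluster joined above.
llm = LLM(
    model="./Mistral-7B-Instruct-v0.1",
    dtype="float16",
    tensor_parallel_size=2,
    swap_space=1,
    gpu_memory_utilization=0.5,
    max_model_len=4096,
)
```

The same keyword arguments should pass through `vllm_kwargs` when going via the llama_index `Vllm` wrapper instead of `vllm.LLM` directly.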
Suggest a potential alternative/fix
No response