min-xu-et
@gkiri can you also provide the vllm command line for your test?
I have seen this error (gloo mesh connection failed) with vllm too. I think it is related to your network setup. I wasn't able to find a solution other than...
In your case, perhaps gloo only needs eth0 since my understanding is that gloo is only used for some low bandwidth coordination between the nodes using a CPU process group...
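For what it's worth, here is a minimal sketch of pinning gloo to one interface, assuming a PyTorch-based launch where the `GLOO_SOCKET_IFNAME` environment variable is honored. The helper name `restrict_gloo_to_interface` is hypothetical, and `eth0` is just the interface mentioned above; substitute whatever your nodes actually use. The variable has to be set before the process group is initialized.

```python
import os


def restrict_gloo_to_interface(ifname: str = "eth0") -> str:
    """Pin gloo's TCP transport to a single network interface.

    Hypothetical helper: it only sets GLOO_SOCKET_IFNAME, which
    torch.distributed's gloo backend reads when the process group
    is created, so call it before init_process_group().
    """
    os.environ["GLOO_SOCKET_IFNAME"] = ifname
    return os.environ["GLOO_SOCKET_IFNAME"]


# Assumption: eth0 is the interface that can reach the other nodes.
restrict_gloo_to_interface("eth0")
```

Exporting the same variable in the shell before launching should have the same effect. This does not change which interface NCCL uses for the GPU traffic.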
I agree it is a low priority. There are a lot of ways to generate diverse output from LLMs and then select a good response. Unclear that beam search is...
I don't know why this is an sglang issue. This seems to be related to the context window length of a given model.
Is there any data on inference-time batch size and token imbalance between experts? What's the total throughput like for an 8xH200 node?