[Question] Use only RDMA non low-latency

Open ybenvidia opened this issue 5 months ago • 5 comments

Is there a way to force dispatch to use only RDMA, even for intranode traffic, in non-low-latency mode? If yes, can I do it at runtime (for example, using an environment variable)? Thanks.

Jul 28 '25 19:07 ybenvidia

It's not supported.

Jul 29 '25 08:07 sphish

What if I set NUM_MAX_NVL_PEERS = 1?

Jul 29 '25 09:07 ybenvidia

You can give it a try, but our code has a limit on num_rdma_ranks, so after you set NUM_MAX_NVL_PEERS = 1 the applicable cases will be quite limited.
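To make the trade-off concrete, here is a rough sketch of the arithmetic, not the library's actual code. It assumes that the number of RDMA ranks is the total rank count divided by NUM_MAX_NVL_PEERS, that the default value of NUM_MAX_NVL_PEERS is 8, and that the cap on num_rdma_ranks is 20 (the figure raised later in this thread, not confirmed here). The function names are hypothetical.

```python
# Illustrative arithmetic only; names and the cap value are assumptions.

ASSUMED_MAX_RDMA_RANKS = 20  # assumption: figure quoted in this thread, unconfirmed


def rdma_rank_count(num_ranks: int, num_max_nvl_peers: int) -> int:
    """RDMA rank count, assuming ranks are grouped into NVLink domains
    of size num_max_nvl_peers (assumed formula, at least one group)."""
    if num_ranks <= 0 or num_max_nvl_peers <= 0:
        raise ValueError("rank counts must be positive")
    return max(1, num_ranks // num_max_nvl_peers)


def max_supported_gpus(num_max_nvl_peers: int,
                       max_rdma_ranks: int = ASSUMED_MAX_RDMA_RANKS) -> int:
    """Largest total GPU count that still fits under the assumed RDMA-rank cap."""
    return num_max_nvl_peers * max_rdma_ranks


if __name__ == "__main__":
    for peers in (8, 1):  # 8 is the assumed default; 1 forces every peer onto RDMA
        print(f"NUM_MAX_NVL_PEERS={peers}: "
              f"16 GPUs -> {rdma_rank_count(16, peers)} RDMA ranks, "
              f"max GPUs under cap = {max_supported_gpus(peers)}")
```

Under these assumptions, setting NUM_MAX_NVL_PEERS = 1 turns every GPU into its own RDMA rank, so the cap on num_rdma_ranks becomes a cap on the total number of GPUs rather than on the number of nodes.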

Jul 30 '25 01:07 sphish

Another question: from what I saw in the code, it can only be used with 20 nodes at most. Is that correct? What if we want to use more than 20 nodes? Will the performance be poor?

Aug 11 '25 12:08 ybenvidia

@ybenvidia Hi, hope it's not too late. We implemented a kernel that supports RDMA only in normal mode; you can refer to https://github.com/deepseek-ai/DeepEP/pull/375, which has already been merged into the hybrid-ep branch :)

Nov 14 '25 07:11 MengYu10151