[Question] Use only RDMA in non-low-latency mode
Is there a way to force dispatch to use only RDMA, even for intranode traffic, in non-low-latency mode? If yes, can I do it at runtime (for example, with an environment variable)? Thanks.
It's not supported.
What if I set NUM_MAX_NVL_PEERS = 1?
You can give it a try, but our code has a limit on the num_rdma_ranks. So after you set NUM_MAX_NVL_PEERS = 1, the applicable cases will be quite limited.
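To illustrate why the applicable cases shrink, here is a rough sketch of how a global rank might be split into an RDMA rank and an NVLink rank. It assumes the split is a plain integer division and modulo by NUM_MAX_NVL_PEERS; the helper names `split_rank` and `fits`, and the `max_rdma_ranks` parameter, are hypothetical and not part of DeepEP. Check the actual limit on num_rdma_ranks in the source before relying on this.

```python
def split_rank(rank: int, num_ranks: int, num_max_nvl_peers: int = 8):
    """Hypothetical helper: map a global rank to (rdma_rank, nvl_rank)
    and return the resulting group counts as well.

    Assumption: the split is integer division / modulo by
    num_max_nvl_peers, which is not confirmed by this thread.
    """
    rdma_rank, nvl_rank = divmod(rank, num_max_nvl_peers)
    num_rdma_ranks = max(1, num_ranks // num_max_nvl_peers)
    num_nvl_ranks = min(num_ranks, num_max_nvl_peers)
    return rdma_rank, nvl_rank, num_rdma_ranks, num_nvl_ranks


def fits(num_ranks: int, num_max_nvl_peers: int, max_rdma_ranks: int) -> bool:
    """Hypothetical check: does the job stay within a given cap on
    num_rdma_ranks? The cap is passed in by the caller, not taken from DeepEP."""
    return max(1, num_ranks // num_max_nvl_peers) <= max_rdma_ranks


if __name__ == "__main__":
    # 32 GPUs with 8 NVLink peers per group: 4 RDMA ranks.
    print(split_rank(13, 32, 8))
    # Same 32 GPUs with NUM_MAX_NVL_PEERS = 1: every GPU is its own RDMA rank.
    print(split_rank(13, 32, 1))
```

Under these assumptions, setting NUM_MAX_NVL_PEERS = 1 makes num_rdma_ranks equal to the total number of GPUs, so any cap on num_rdma_ranks is reached with far fewer GPUs than in the default configuration.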
Another question: from what I saw in the code, it can only be used with 20 nodes at most. Is that correct? What if we want to use more than 20 nodes? Will the performance be poor?
@ybenvidia Hi, hope it's not too late. We implemented a kernel that supports RDMA only in normal mode. You can refer to https://github.com/deepseek-ai/DeepEP/pull/375, which has already been merged into the hybrid-ep branch :)