DeepEP Ada Lovelace support

Hi team, thank you for your excellent work. I wonder whether this repo could support the Ada Lovelace architecture, for example the L20 GPU.

Thanks

Feb 25 '25 03:02 xutizhou

First of all, I'm not a member of the team.

As I understand it, as long as your cluster has RDMA (usually InfiniBand NICs and the corresponding software stack) and NVLink between GPUs, and the environment meets the NVSHMEM requirements, it may be usable.

Feb 25 '25 04:02 NorthSecond

Could you please confirm whether Ada Lovelace GPUs support GPUDirect RDMA (GDR) and InfiniBand GPUDirect Async (IBGDA)? If so, DeepEP should also be able to run on this architecture.

Feb 25 '25 09:02 haswelliris

@NorthSecond That does not work. NVSHMEM does not rely on NVLink. I tried it on one node with 8 L20 cards and it would not run successfully: after running for a while it reports an error, and it seems that one of the kernels fails during execution. Could Lyric Zhao give me some hints?

Mar 11 '25 10:03 wangzhen2271
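
For anyone checking a similar single-node setup, here is a minimal sketch (not taken from the DeepEP repository) that inspects the text printed by `nvidia-smi topo -m` and reports whether any GPU pair is connected by NVLink. The helper names `gpu_link_matrix`, `has_nvlink`, and `read_topology` are made up for illustration, and the parsing assumes the usual matrix layout in which the GPU columns come first and NVLink links are shown as `NV` followed by a number.

```python
import re
import subprocess

# Some nvidia-smi versions underline the header row with ANSI escape codes.
ANSI_ESCAPE = re.compile(r"\x1b\[[0-9;]*m")


def gpu_link_matrix(topo_text):
    """Parse `nvidia-smi topo -m` text into {gpu_label: [link type to each GPU]}."""
    clean = ANSI_ESCAPE.sub("", topo_text)
    lines = [ln for ln in clean.splitlines() if ln.strip()]
    if not lines:
        return {}
    # Assumption: GPU columns are listed first in the header, so counting them
    # lets us ignore the NIC and CPU/NUMA affinity columns that follow.
    n_gpus = sum(1 for tok in lines[0].split() if re.fullmatch(r"GPU\d+", tok))
    matrix = {}
    for ln in lines[1:]:
        cells = ln.split()
        if re.fullmatch(r"GPU\d+", cells[0]):
            matrix[cells[0]] = cells[1:1 + n_gpus]
    return matrix


def has_nvlink(topo_text):
    """Return True if any GPU-to-GPU cell is an NVLink entry such as NV4 or NV12."""
    return any(
        re.fullmatch(r"NV\d+", cell)
        for row in gpu_link_matrix(topo_text).values()
        for cell in row
    )


def read_topology():
    """Run `nvidia-smi topo -m`; needs the NVIDIA driver tools on PATH."""
    result = subprocess.run(
        ["nvidia-smi", "topo", "-m"], capture_output=True, text=True, check=True
    )
    return result.stdout
```

On a machine with the driver installed, `has_nvlink(read_topology())` returning `False` would suggest that the GPUs only reach each other over PCIe.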

Hi @wangzhen2271, have you successfully deployed DeepEP on L20?

Jun 10 '25 06:06 Xiaofei-fei

Hi @Xiaofei-fei, I ran into the same issue as you. Have you deployed it successfully?

Aug 19 '25 04:08 MengYu10151

We have resolved most of the issues in intranode mode and can now run it together with SGLang, but some problems are still being worked on.

Aug 19 '25 06:08 Xiaofei-fei

By the way, I noticed your technical talk on deploying DeepEP on PCIe GPUs, and I am very interested in the idea of merging the low-latency and normal kernels. Could you share a contact so that I can discuss the technical details further?

Sep 08 '25 09:09 Xiaofei-fei

We really appreciate your attention to our work! We have already submitted a PR to support normal mode without NVLink (NVL): https://github.com/deepseek-ai/DeepEP/pull/375. You can contact me via WeChat at misty10151. Thanks :)

Sep 08 '25 10:09 MengYu10151