Ada Lovelace support
Hi team, thank you for your excellent work. I wonder whether this repo could support Ada Lovelace architecture GPUs such as the L20.
Thanks
First of all, I'm not a member of the team.
As I understand it, DeepEP may be usable as long as your cluster has RDMA (usually IB NICs and the corresponding software stack) and NVLink between GPUs, and the environment meets the NVSHMEM requirements.
Could you please confirm whether Ada Lovelace GPUs support GPUDirect RDMA (GDR) and InfiniBand GPUDirect Async (IBGDA)? If so, DeepEP should also be able to run on this architecture.
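One of the conditions mentioned here, NVLink between GPUs, can be checked from the output of `nvidia-smi topo -m`, where NVLink connections appear as `NV#` cells in the GPU matrix. The following is only a minimal sketch of such a check: the function name `gpu_pairs_with_nvlink` and both sample matrices are hypothetical and hand-written for illustration, not real logs from any machine, and the sketch says nothing about RDMA, GDR, or IBGDA support.

```python
import re

# Link cells such as NV1, NV4, NV12 mark NVLink connections in `nvidia-smi topo -m`.
NVLINK_CELL = re.compile(r"NV\d+")
GPU_NAME = re.compile(r"GPU\d+")


def gpu_pairs_with_nvlink(topo_text):
    """Return (row_gpu, col_gpu) pairs whose topology cell is an NV# link."""
    header = None
    pairs = []
    for line in topo_text.splitlines():
        cells = line.split()
        if not cells:
            continue
        if header is None:
            # The header row starts with GPU0 and lists the GPU columns first.
            if GPU_NAME.fullmatch(cells[0]):
                header = [c for c in cells if GPU_NAME.fullmatch(c)]
            continue
        if not GPU_NAME.fullmatch(cells[0]):
            continue  # skip NIC rows and the legend
        for col_name, cell in zip(header, cells[1:1 + len(header)]):
            if NVLINK_CELL.fullmatch(cell):
                pairs.append((cells[0], col_name))
    return pairs


# Hypothetical, hand-written samples (assumptions for illustration only).
SAMPLE_PCIE_ONLY = """\
        GPU0    GPU1    CPU Affinity
GPU0     X      SYS     0-31
GPU1    SYS      X      32-63
"""

SAMPLE_WITH_NVLINK = """\
        GPU0    GPU1    CPU Affinity
GPU0     X      NV12    0-31
GPU1    NV12     X      0-31
"""

if __name__ == "__main__":
    print(gpu_pairs_with_nvlink(SAMPLE_PCIE_ONLY))
    print(gpu_pairs_with_nvlink(SAMPLE_WITH_NVLINK))
```

On a real node, the text could come from `subprocess.run(["nvidia-smi", "topo", "-m"], capture_output=True, text=True).stdout`. An empty result would suggest the GPUs communicate over PCIe only.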
It does not work. NVSHMEM does not rely on NVLink. I tried it on a single node with 8 L20 cards, and it will not run successfully: after running for a while it reports an error, and it looks like a kernel execution has failed. Can Lyric Zhao give me some hints?
Hi, have you successfully deployed DeepEP on the L20?
Hi @Xiaofei-fei, I ran into the same issue as you. Have you deployed it successfully?
We have resolved most of the issues in intranode mode and can now run it together with SGLang, but some problems are still being worked on.
By the way, I noticed your technical talk on deploying DeepEP on PCIe GPUs, and I am very interested in the idea of merging the low-latency and normal-mode kernels. Could you please share a contact so that I can discuss the technical details further?
We really appreciate your attention to our work! We've actually already submitted a PR to support normal mode without NVLink (https://github.com/deepseek-ai/DeepEP/pull/375), and you can contact me via WeChat at misty10151. Thanks :)