Baibaifan

Results: 26 comments by Baibaifan

> This is a bug in NVSHMEM 3.1.7 and can be resolved by using NVSHMEM 3.2.5. [#17 (comment)](https://github.com/deepseek-ai/DeepEP/issues/17#issuecomment-2684327121)

How should conflicts in the DeepEP patch packages be handled?

> > > This is a bug in NVSHMEM 3.1.7 and can be resolved by using NVSHMEM 3.2.5. [#17 (comment)](https://github.com/deepseek-ai/DeepEP/issues/17#issuecomment-2684327121)
> > >
> > > How to deal with...

> > I have created a new branch that updates NVSHMEM to version 3.2.5. However, I don't have a RoCE environment for verification. Can you test this out? [@Baibaifan](https://github.com/Baibaifan) Branch:...

> > > > I have created a new branch that updates NVSHMEM to version 3.2.5. However, I don't have a RoCE environment for verification. Can you test this out?...

> > Hi [@sphish](https://github.com/sphish), the process works, but the performance does not seem to meet expectations.
> >
> > env:
> >
> > 1. H100 80GB HBM3 *8/HPC
> >...

> NCCL_DEBUG=INFO MASTER_ADDR=xxx WORLD_SIZE=2 RANK=0 python tests/test_internode.py

![Image](https://github.com/user-attachments/assets/c2c228b8-8f5b-4b1a-8843-d241823c9f90)

After correcting the wrong `OOB` configuration, the current speed with the 4 NICs is:

```
[tuning] Best dispatch (FP8): SMs 24,...
```

> perftest

Could you please send me the command to run `perftest` for reference? @kunfupanda-hw

> > > > [@Baibaifan](https://github.com/Baibaifan) What is the `OOB` configuration? Regarding the performance issue, I agree. It appears that the bandwidth is limited by the NICs.
> > > > >...

> ib_write_bw

I mean the tests in `perftest/perftest_install`, in the `nvshmem_src` directory. @kunfupanda-hw