Baibaifan
> This is a bug in NVSHMEM 3.1.7 and can be resolved by using NVSHMEM 3.2.5. [#17 (comment)](https://github.com/deepseek-ai/DeepEP/issues/17#issuecomment-2684327121)

How should I deal with conflicts in the DeepEP patch packages?
> > > This is a bug in NVSHMEM 3.1.7 and can be resolved by using NVSHMEM 3.2.5. [#17 (comment)](https://github.com/deepseek-ai/DeepEP/issues/17#issuecomment-2684327121) > > > > > > How to deal with...
> > I have created a new branch that updates NVSHMEM to version 3.2.5. However, I don't have a RoCE environment for verification. Can you test this out? [@Baibaifan](https://github.com/Baibaifan) Branch:...
> > > > I have created a new branch that updates NVSHMEM to version 3.2.5. However, I don't have a RoCE environment for verification. Can you test this out?...
> > Hi [@sphish](https://github.com/sphish),
> > The process works, but the performance does not seem to meet expectations.
> >
> > env:
> >
> > 1. H100 80GB HBM3 *8/HPC
> >...
> NCCL_DEBUG=INFO MASTER_ADDR=xxx WORLD_SIZE=2 RANK=0 python tests/test_internode.py

After fixing the incorrect `OOB` configuration, the current speed with the 4 NICs is: ``` [tuning] Best dispatch (FP8): SMs 24,...
> perftest

Could you please send me the command to run `perftest` for reference? @kunfupanda-hw
> > > > [@Baibaifan](https://github.com/Baibaifan) What is `OOB` configuration? Regarding the performance issue, I agree. It appears that the bandwidth is limited by the NICs. > > > > >...
> ib_write_bw

I mean the tests in `perftest/perftest_install`, in the `nvshmem_src` directory. @kunfupanda-hw