nannaer comments

Results: 30 comments of nannaer

EPLB

Thank you very much for your contributions to EPLB! I would like to ask you a question. Does the current main branch support "Support changing locations of experts when server...

EPLB

> > Support changing locations of experts when server is running > > Sure, `--enable-eplb` > > Support changing locations of experts when server is running > > Sure, `--enable-eplb`...

EPLB

> start from EPLBManager and the logic is pretty easy to read Thanks!

EPLB

> @tianhaoz95 Hi > > > since only redundant experts change during the rebalance > > almost all (at least most) experts change indeed Hi expert, take DeepSeek V3 as...

Where do dispatch and combine need to be synchronized?

I tried to find where the synchronization is implemented by looking at the code, but I still don't fully understand. Your guidance would be of great help to me! Thanks...

Where do dispatch and combine need to be synchronized?

> > Is synchronization across all ranks needed before dispatching SEND/RECV operations? > > Is synchronization across all ranks needed after dispatching SEND/RECV operations? > > Is synchronization across all...

Where do dispatch and combine need to be synchronized?

> > What changes will occur in the end-to-end latency of each RANK? Can it be estimated as max(Dispatch latency) + Expert Group Gemm latency + max(Combine latency)? > >...
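For readers following the question above, the proposed estimate can be written out as a small calculation. This is only a sketch of the formula as asked in the comment, not a confirmed model of DeepEP's behavior; the function name, argument names, and sample numbers are hypothetical.

```python
from typing import Sequence


def estimate_e2e_latency(
    dispatch_latencies: Sequence[int],
    gemm_latency: int,
    combine_latencies: Sequence[int],
) -> int:
    """Estimate from the question:
    max(dispatch latency) + expert group GEMM latency + max(combine latency).

    All values are in the same unit (for example, microseconds).
    The per-rank lists hold one measured latency per rank.
    """
    if not dispatch_latencies or not combine_latencies:
        raise ValueError("need at least one latency per phase")
    return max(dispatch_latencies) + gemm_latency + max(combine_latencies)


# Hypothetical per-rank measurements in microseconds.
estimate = estimate_e2e_latency([10, 12, 11], 30, [8, 9, 7])
print(estimate)
```

Whether the slowest rank really bounds each phase depends on where ranks wait for one another, which is the open question in this thread.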

Where do dispatch and combine need to be synchronized?

> For example, the only wait-data-arrival of dispatch is here: https://github.com/deepseek-ai/DeepEP/blob/main/csrc/kernels/internode_ll.cu#L492. How does a RANK know how many inputs it should receive from other RANKs? Does this require an operation...

How to use test_low_latency to profile the latency with different batch sizes on different RANKs?

> [DeepEP/deep_ep/buffer.py](https://github.com/deepseek-ai/DeepEP/blob/483f00af8490b0cc378823c6adecf9ea67602071/deep_ep/buffer.py#L84)
>
> Line 84 in 483f00a
>
> `os.environ['NVSHMEM_QP_DEPTH'] = '1024'`
>
> Can you try setting this to a larger number, like 4096? Thanks!
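A minimal sketch of the suggested change, assuming the value only needs to be present in the process environment before NVSHMEM is initialized. The helper name `set_qp_depth` is hypothetical and is not part of DeepEP. Because the quoted line in `buffer.py` assigns the variable directly, a value set earlier would presumably be overwritten there, so the suggestion most likely means editing that line itself.

```python
import os


def set_qp_depth(depth: int = 4096) -> str:
    """Set NVSHMEM_QP_DEPTH in the current process environment.

    Hypothetical helper for illustration only; the comment above
    suggests trying 4096 instead of the 1024 set in buffer.py.
    """
    if depth <= 0:
        raise ValueError("depth must be a positive integer")
    os.environ["NVSHMEM_QP_DEPTH"] = str(depth)
    return os.environ["NVSHMEM_QP_DEPTH"]


print(set_qp_depth(4096))
```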