jeffye-dev issues

Results: 3 issues of jeffye-dev

How to achieve 253 tok/sec with DeepSeek-R1-FP4 on 8xB200

I want to reproduce the DeepSeek-R1-FP4 on B200 deployment to match the results in the blog post (https://developer.nvidia.com/blog/nvidia-blackwell-delivers-world-record-deepseek-r1-inference-performance). However, I only get 40 output tokens per second per user, compared with the...

triaged

[test_internode.py] failed on multi-QP: dispatch timeout on RoCE network when testing 2*H20 nodes

When I run the cross-node test with `MASTER_ADDR= MASTER_PORT=30001 WORLD_SIZE=2 RANK=0 python test_internode.py` on 2*H20 nodes, I get the following timeout log: ``` DeepEP dispatch NVL receiver timeout, channel: 7,...

UE8M0 (PR206) feature causes a severe regression and low-latency hang

In recent SGLang PD disaggregation integration tests, we found that it gets stuck 100% of the time in the DeepEP dispatch-combine call. The low_latency.py unit test stack looks as follows when it is stuck: > __torch_function__...