Megatron-LM
[BUG] Problem in run_text_generation_server.py
https://github.com/NVIDIA/Megatron-LM/blob/8416bff56a06771834be00327e1523aff4b20f88/tools/run_text_generation_server.py#L228 If TP=1, PP=1, and EP=N, multiple ranks will use the same port. The condition should be:
if mpu.is_pipeline_first_stage() and mpu.get_tensor_model_parallel_rank() == 0 and mpu.get_expert_model_parallel_rank() == 0:
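For context, here is a minimal sketch of how the corrected guard might look in run_text_generation_server.py. The mpu helpers are the megatron.core parallel-state functions referenced above; the MegatronServer launch inside the branch is assumed from the surrounding code and shown only for illustration:

```python
from megatron.core import mpu

# Only the rank on the first pipeline stage with tensor-parallel rank 0
# AND expert-parallel rank 0 should start the HTTP server. Without the
# EP check, every expert-parallel rank passes the guard when TP=1, PP=1,
# EP=N, and they all try to bind the same port.
if (mpu.is_pipeline_first_stage()
        and mpu.get_tensor_model_parallel_rank() == 0
        and mpu.get_expert_model_parallel_rank() == 0):
    server = MegatronServer(model)              # assumed surrounding code
    server.run("0.0.0.0", port=args.port)       # assumed surrounding code
```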
Also, I'm curious: does the TP=1, PP=1, EP=N configuration correspond to pure DP for the non-MoE layers combined with EP for the MoE layers?