Megatron-LM
[BUG] Problem in run_text_generation_server.py
https://github.com/NVIDIA/Megatron-LM/blob/8416bff56a06771834be00327e1523aff4b20f88/tools/run_text_generation_server.py#L228 If TP=1, PP=1, and EP=N, multiple ranks will use the same port. The condition should be:
if mpu.is_pipeline_first_stage() and mpu.get_tensor_model_parallel_rank() == 0 and mpu.get_expert_model_parallel_rank() == 0:
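For context, here is a minimal sketch of how the corrected guard might look in run_text_generation_server.py. The mpu helpers are the megatron.core parallel-state functions referenced above; the MegatronServer launch inside the branch is assumed from the surrounding code and shown only for illustration:

```python
from megatron.core import mpu

# Only the rank on the first pipeline stage with tensor-parallel rank 0
# AND expert-parallel rank 0 should start the HTTP server. Without the
# EP check, every expert-parallel rank passes the guard when TP=1, PP=1,
# EP=N, and they all try to bind the same port.
if (mpu.is_pipeline_first_stage()
        and mpu.get_tensor_model_parallel_rank() == 0
        and mpu.get_expert_model_parallel_rank() == 0):
    server = MegatronServer(model)              # assumed surrounding code
    server.run("0.0.0.0", port=args.port)       # assumed surrounding code
```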
Also, I'm curious: does the TP=1, PP=1, EP=N configuration correspond to pure DP for the non-MoE layers combined with EP for the MoE layers?