Zero Bubble Pipeline Parallelism

Results: 24 zero-bubble-pipeline-parallelism issues

I'm curious about how you measured the precise bubble time during a run in your experiments (T_Comm in the paper). Megatron-LM's scheduling combines communication and idle time within the same NCCL...

I tested Llama 2 13B on A800 with pipeline parallelism (PP) of 4, micro-batch-size = 1, and global-batch-size = 64. Here is the 1F1B log; I use only 1F1B, not VP. iteration...

I see that zero-bubble-pipeline-parallelism disables FusedLayerNorm. Is it because the fused op cannot split the backward pass into separate W (weight-gradient) and X (input-gradient) steps?
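
For context on what "splitting the backward of W and X" means, here is a minimal sketch for a plain linear layer `y = x @ W`. It does not answer the FusedLayerNorm question, and it is not code from the zero-bubble-pipeline-parallelism repository. The class `SplitLinear` and its method names are hypothetical, and pure-Python lists stand in for tensors. The input gradient is computed first because the previous pipeline stage is waiting for it, while the weight gradient can be deferred to fill what would otherwise be idle time.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(a):
    """Transpose a matrix given as a list of rows."""
    return [list(row) for row in zip(*a)]


class SplitLinear:
    """Hypothetical linear layer y = x @ W with a two-step backward pass."""

    def __init__(self, weight):
        self.weight = weight
        self.saved_x = None
        self.saved_grad_y = None

    def forward(self, x):
        # Keep the input; the deferred weight-gradient step needs it.
        self.saved_x = x
        return matmul(x, self.weight)

    def backward_input(self, grad_y):
        # X step: dL/dx = dL/dy @ W^T. The upstream stage waits on this.
        self.saved_grad_y = grad_y
        return matmul(grad_y, transpose(self.weight))

    def backward_weight(self):
        # W step: dL/dW = x^T @ dL/dy. Nothing upstream depends on it,
        # so a scheduler is free to run it later.
        return matmul(transpose(self.saved_x), self.saved_grad_y)


layer = SplitLinear([[1.0, 2.0], [3.0, 4.0]])
y = layer.forward([[1.0, 1.0]])
grad_x = layer.backward_input([[1.0, 1.0]])  # sent upstream right away
grad_w = layer.backward_weight()             # can be delayed
```

A fused kernel that produces both gradients in a single call offers no such seam between the two steps, which is the concern the question raises.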