torchtitan
PP hangs when pipeline_parallel_microbatches < pipeline_parallel_degree
Pipeline parallelism seems to hang when the number of microbatches (`pipeline_parallel_microbatches`) is less than the pipeline parallel degree (`pipeline_parallel_degree`). The issue occurs with both the standard and the interleaved 1F1B schedules. I have not tested other schedules.
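Until the root cause is found, a fail-fast check at startup could at least turn the hang into a clear error. The following is only a minimal sketch of such a guard: the function name `check_microbatch_count` is hypothetical and not part of torchtitan, and the condition simply mirrors the behavior reported above. It is not a confirmed requirement of every schedule.

```python
def check_microbatch_count(
    pipeline_parallel_microbatches: int,
    pipeline_parallel_degree: int,
) -> None:
    """Raise early instead of hanging (hypothetical helper, not a torchtitan API).

    Assumption: the schedule needs at least one microbatch per pipeline
    stage, which matches the hang reported for the 1F1B schedules.
    """
    if pipeline_parallel_degree < 1:
        raise ValueError(
            f"pipeline_parallel_degree must be >= 1, got {pipeline_parallel_degree}"
        )
    if pipeline_parallel_microbatches < pipeline_parallel_degree:
        raise ValueError(
            f"pipeline_parallel_microbatches ({pipeline_parallel_microbatches}) "
            f"is less than pipeline_parallel_degree ({pipeline_parallel_degree}); "
            "this configuration has been observed to hang"
        )
```

Calling `check_microbatch_count(2, 4)` before building the schedule would raise a `ValueError` naming both values, whereas `check_microbatch_count(4, 4)` passes silently.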