Haixin Nan

Results 3 issues of Haixin Nan

The original Linux cosmic repo `git://kernel.ubuntu.com/ubuntu/ubuntu-cosmic.git` is no longer available. This change points to the new source, [https://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux/+git/cosmic](https://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux/+git/cosmic) (cherry picked from commit bfb0ef296986a5b9adc3784ba85b361ef85c3763). This commit fixes #24.

Hi Megatron-LM team! While going through the code in `megatron/core/pipeline_parallel/schedules.py`, I noticed that between each forward and backward pass, the line `total_num_tokens += num_tokens.item()` uses the `item()` method. https://github.com/NVIDIA/Megatron-LM/blob/8ca9e57f9d0bb93fc61850ebdccb6b6e6fa36b64/megatron/core/pipeline_parallel/schedules.py#L451-L467 From...

stale

Not just for the GPU with `global_rank=0`, as the code below shows. https://github.com/NVIDIA/Megatron-LM/blob/28118fcdc22e42621776a021af568ae39c198418/pretrain_gpt.py#L257-L260 https://github.com/NVIDIA/Megatron-LM/blob/28118fcdc22e42621776a021af568ae39c198418/pretrain_gpt.py#L306-L311

stale