Qi Penghui

2 issues by Qi Penghui

**Describe the bug**
In `megatron/core/pipeline_parallel/schedules.py`, should `finish_embedding_wgrad_compute` be called before `enable_grad_sync` and `grad_sync_func`?

**Expected behavior**
Gradient all-reduce should happen after the gradient computations have finished.
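
The ordering concern can be illustrated with a small toy sketch in plain Python. It is not taken from `schedules.py` and does not call any Megatron-LM API: `ToyParam`, `finish_wgrad_compute`, `grad_sync`, and `run` are hypothetical stand-ins, and the "all-reduce" is simulated as a snapshot of the locally accumulated gradient. Under those assumptions, it shows why a sync that runs before deferred weight-gradient work is flushed would capture an incomplete gradient.

```python
# Toy model of the ordering concern. None of this is Megatron-LM code;
# every name below is a hypothetical stand-in chosen for illustration.


class ToyParam:
    def __init__(self):
        self.local_grad = 0.0     # gradient accumulated on this rank so far
        self.synced_grad = None   # value captured by the simulated all-reduce
        self.pending_wgrad = []   # deferred weight-gradient contributions


def finish_wgrad_compute(param):
    """Flush deferred weight-gradient contributions into local_grad."""
    while param.pending_wgrad:
        param.local_grad += param.pending_wgrad.pop()


def grad_sync(param):
    """Simulated all-reduce: snapshot whatever has been accumulated so far."""
    param.synced_grad = param.local_grad


def run(sync_first):
    """Return (synced_grad, local_grad) for the chosen call order."""
    param = ToyParam()
    param.local_grad = 1.0
    param.pending_wgrad = [2.0, 3.0]
    if sync_first:
        # Sync before the deferred work is flushed: the snapshot is stale.
        grad_sync(param)
        finish_wgrad_compute(param)
    else:
        # Flush the deferred work first, then sync the complete gradient.
        finish_wgrad_compute(param)
        grad_sync(param)
    return param.synced_grad, param.local_grad


print(run(sync_first=True))   # synced value misses the deferred contributions
print(run(sync_first=False))  # synced value matches the full local gradient
```

In this toy, syncing first leaves `synced_grad` smaller than `local_grad`, while flushing first makes the two equal. Whether the real schedule has this problem depends on how the actual functions interact, which is the question the issue raises.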