chaoyang

Results 3 issues of chaoyang

### Checklist

- [ ] 1. I have searched related issues but cannot get the expected help.
- [ ] 2. The bug has not been fixed in the latest...

![image](https://github.com/SafeAILab/EAGLE/assets/30709861/a17adc6b-f47c-499e-9dec-97ef037c891f)

**Your question** Ask a clear and concise question about Megatron-LM. When `calculate_per_token_loss` is enabled, `finalize_model_grads` scales the gradients according to the number of tokens (the total for one iteration). However, I observed...
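To make the scaling described in this question concrete, here is a minimal sketch in plain Python. It is not Megatron-LM code: the helper names `per_token_mean` and `per_microbatch_mean` are hypothetical, and it assumes that "scaling by the number of tokens" means dividing the summed loss by the total token count of the iteration. Because gradients are linear in the loss, that is equivalent to dividing the summed gradients by the same count.

```python
# Illustrative only: plain Python, not the Megatron-LM implementation.
# Each inner list holds the per-token losses of one micro-batch.

def per_token_mean(microbatch_token_losses):
    """Sum all token losses across micro-batches, then divide by the total token count."""
    total_loss = sum(sum(mb) for mb in microbatch_token_losses)
    total_tokens = sum(len(mb) for mb in microbatch_token_losses)
    return total_loss / total_tokens


def per_microbatch_mean(microbatch_token_losses):
    """Average each micro-batch on its own, then average those means."""
    means = [sum(mb) / len(mb) for mb in microbatch_token_losses]
    return sum(means) / len(means)


# Two micro-batches with unequal token counts (4 tokens and 1 token).
losses = [[1.0, 1.0, 1.0, 1.0], [3.0]]

print(per_token_mean(losses))       # every token weighted equally
print(per_microbatch_mean(losses))  # every micro-batch weighted equally
```

The two normalizations agree only when every micro-batch has the same number of tokens. With variable-length sequences they diverge, which is why the token count used for scaling matters.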

community-request