Zijie Yan

58 comments by Zijie Yan

Thanks! Let us take a deeper look @Victarry

Merged in https://github.com/NVIDIA/Megatron-LM/commit/afb755f548b48151a4408b0e9caf674b8349b589. Many thanks to the Infrastructure Center of Tencent WeChat's Technical Architecture Department (微信技术架构部-基础架构中心) for their contributions. cc @shifangx

Thanks for reporting the issue! I suspect the discrepancy is due to the different accumulation orders of reduction during token combination. We've received feedback from other customers suggesting that reduction...
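To illustrate the accumulation-order point, here is a minimal sketch in plain Python. It is not Megatron-LM code: `combine` and the `contributions` values are hypothetical, chosen only to show that floating-point addition is not associative, so summing the same per-expert contributions in a different order can give a different result.

```python
import math


def combine(contributions, order):
    """Sum one token's per-expert contributions in the given order.

    Hypothetical helper for illustration only; real token combination
    runs on tensors across ranks, not on Python lists.
    """
    total = 0.0
    for idx in order:
        total += contributions[idx]
    return total


# Hypothetical contributions from three experts to one output element.
# The magnitudes are exaggerated so the effect is visible in float64.
contributions = [1e16, 1.0, -1e16]

sum_a = combine(contributions, [0, 1, 2])  # (1e16 + 1.0) - 1e16
sum_b = combine(contributions, [0, 2, 1])  # (1e16 - 1e16) + 1.0
exact = math.fsum(contributions)           # correctly rounded reference sum

print(sum_a, sum_b, exact)  # 0.0 1.0 1.0
```

Promoting to a wider dtype shrinks this kind of gap but does not remove the dependence on order, so two runs that reduce in different orders are not generally expected to match bit for bit.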

> Hi [@yanring](https://github.com/yanring), after manually setting the probs to fp64, I still have precision issues with EP. Do you have any suggestion on what else needs to be promoted...

LGTM, left a few comments. Will trigger merge after resolving them.

Thanks for reporting and fixing it. Unfortunately, we can't directly merge this PR on GitHub, but we'll include the fix in the next version.

@yiakwy-xpu-ml-framework-team Hey, thanks for the PR. Since this is a follow-up PR, I'll temporarily make it a draft.

This PR introduces significant logic changes but no corresponding unit tests. Please consider adding tests for _get_sample_arguments().

/ok to test efb8f5dc4fe7ee0d7aec6d8a546fb82182430489