Lifu Zhang

Results 4 issues of Lifu Zhang

# What does this PR do ? This PR adds context-parallel support for packed datasets in THD format to NeMo, in response to this TE PR: https://github.com/NVIDIA/TransformerEngine/pull/641. Currently, the...

NLP

# What does this PR do ? This PR adds CP support for the THD format and is compatible with cu_seqlen_padded in the latest cuDNN fused attention. **PR Type**: - [x]...

NLP

# What does this PR do ? This PR adds a fix that allows the HSDP device mesh to be registered in EP submeshes. :warning: For major changes (either in lines...

Expert Review

# What does this PR do ? This PR adds a fix for the precision-aware optimizer for DeepSeek V3. :warning: For major changes (either in lines of code or in its impact),...

Final Review
dev branch