DXZDXZ

Results: 1 issue by DXZDXZ

**Describe the bug**
When using Megatron-Core v0.9.0 with CUDA Graphs enabled, NaN gradients are encountered during the backward computation. This issue does not occur when CUDA Graphs are disabled.

**To...

stale