guyueh1

Results 20 issues of guyueh1

**Is your feature request related to a problem? Please describe.** Currently `async_save` is [disabled](https://github.com/NVIDIA-NeMo/RL/blame/8762f575c0d11aeb8a073a64e49cf433eb77c94a/nemo_rl/models/policy/megatron_policy_worker.py#L654) in mcore-path checkpointing, so serialization takes a long time while training is paused; we should test `async_save` and...

enhancement
Performance
research
mcore
checkpointing

When the number of nodes is >=32 and `policy.train` is called, some ranks take a long time to perform the initial synchronization (AllReduce NCCL kernel), but on some other ranks...

Performance
mcore

Tracking items to improve performance for long max-seqlen workloads, including training and generation performance at long context.

Performance

**Is your feature request related to a problem? Please describe.** Right now nemo-rl logs the mean and max generated tokens per sample every step, but those two metrics cannot fully...

enhancement

**Is your feature request related to a problem? Please describe.** Right now in nemo-rl GRPO, the generation workers return the `token_ids` to the header node on CPU, and the header node...

enhancement
Performance

Issue to track low-precision GRPO recipe testing and perf optimization

Low Precision
B200
GB200

**Is your feature request related to a problem? Please describe.** Release code for FP8 rollout + FP8 training (blockwise scaling) * Convergence test and downstream evaluation compared w/ BF16 for...

inference
mcore
Low Precision

**What does this PR do?** Random dataset following a specified input and output sequence length. **Issues:** closes #1302. **Usage:** Use the following flags for fixed ISL/OSL eval...

CI:L0

**Is your feature request related to a problem? Please describe.** Change the performance test script to use the DAPO algorithm and the Math17k dataset.

enhancement

Tracking v0.5 items for MoE performance; the example model is DeepSeek-V3.

Performance