
Pipeclean NeMo RL training with all environments

Open bxyu-nvidia opened this issue 2 weeks ago • 2 comments

Use cases, pain points, and background

Description:

Design:

Out of scope:

Acceptance Criteria:

  • [ ] Every training environment can be trained easily with both an instruct model and a thinking model
  • [ ] Any fixes needed along the way are made

bxyu-nvidia, Dec 12 '25 20:12

Please also check the Hugging Face datasets themselves on the HF Hub. We also want to fix issues like the one in the screenshot below, taken from https://huggingface.co/datasets/nvidia/Nemotron-RL-math-OpenMathReasoning

Image

I believe the fix for this particular issue is to rename the file from open_math_reasoning_problems.jsonl to train.jsonl, after which it should be picked up.
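As a rough illustration of the rename suggested above, here is a minimal sketch that works on a local clone of the dataset repo. The helper name `rename_to_train_split` and the `repo_dir` argument are hypothetical, and committing and pushing the change back to the Hub is left out.

```python
from pathlib import Path


def rename_to_train_split(
    repo_dir,
    old_name="open_math_reasoning_problems.jsonl",
    new_name="train.jsonl",
):
    """Rename the data file inside a local clone (hypothetical helper)."""
    repo = Path(repo_dir)
    src = repo / old_name
    dst = repo / new_name
    if not src.is_file():
        raise FileNotFoundError(f"expected data file not found: {src}")
    if dst.exists():
        # Refuse to overwrite an existing split file.
        raise FileExistsError(f"target already exists: {dst}")
    src.rename(dst)
    return dst
```

Whether the rename alone fixes the Hub page still needs to be confirmed on the dataset page after the change is pushed.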

bxyu-nvidia, Dec 16 '25 00:12

For QA, please ensure that all the rows in the HF dataset match what is expected
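One possible way to approach the row check is sketched below. It assumes the data is JSONL and that "expected" means each row is a JSON object containing a known set of columns. The function name `find_bad_rows` and the example column names `problem` and `answer` are hypothetical, not taken from the actual dataset schema.

```python
import json


def find_bad_rows(lines, required_keys):
    """Return (line_number, reason) for every row that fails the check."""
    problems = []
    for lineno, line in enumerate(lines, start=1):
        if not line.strip():
            problems.append((lineno, "blank line"))
            continue
        try:
            row = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append((lineno, f"invalid JSON: {exc.msg}"))
            continue
        if not isinstance(row, dict):
            problems.append((lineno, "row is not a JSON object"))
            continue
        # Hypothetical schema check: every required column must be present.
        missing = [key for key in required_keys if key not in row]
        if missing:
            problems.append((lineno, f"missing keys: {missing}"))
    return problems
```

For example, `find_bad_rows(open("train.jsonl", encoding="utf-8"), ["problem", "answer"])` would list the offending line numbers. An empty list means every row passed this basic check.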

bxyu-nvidia, Dec 16 '25 00:12