fix: Add helpful error message if prepare_for_*() not called
#1141
Adds checks to train, get_logprobs, and generate in MegatronPolicyWorker to raise a clear error if the model is not prepared for GPU execution.
Please let me know if my approach or fix needs any improvements. I'm open to feedback and happy to make changes based on suggestions. Thank you!
Summary by CodeRabbit
New Features
- Adds a readiness check requiring explicit preparation before training or inference, improving safety and clarity.
- Provides consistent, user-friendly error messages if operations are attempted before preparation.

Bug Fixes
- Prevents accidental GPU execution before the model is properly prepared, reducing crashes and undefined behavior during training and generation.
📝 Walkthrough
Adds a per-instance readiness flag to MegatronPolicyWorker. Methods train, get_logprobs, and generate now raise RuntimeError if called before preparation. The flag is set during prepare_for_training and prepare_for_lp_inference. This enforces an explicit preparation step before GPU-bound execution.
Changes
| Cohort / File(s) | Summary |
|---|---|
| Readiness gating for GPU execution<br>`nemo_rl/models/policy/megatron_policy_worker.py` | Introduced `is_prepared` flag (default `False`). Added guards in `train`, `get_logprobs`, and `generate` to raise `RuntimeError` if not prepared. Set `is_prepared = True` in `prepare_for_training` and `prepare_for_lp_inference`. Centralized the error message for unprepared invocation. |
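The gating pattern summarized above can be sketched as follows. This is a minimal illustration of the flag-and-guard approach, not the actual `MegatronPolicyWorker` code: method bodies are stubbed, and the error message text is an assumption.

```python
class MegatronPolicyWorker:
    """Minimal sketch of the readiness-gating pattern (not the real class)."""

    # Centralized message so all three guarded methods report consistently.
    _NOT_PREPARED_MSG = (
        "Model is not prepared for GPU execution. Call prepare_for_training() "
        "or prepare_for_lp_inference() first."
    )

    def __init__(self):
        # Defaults to False until an explicit prepare_for_*() call.
        self.is_prepared = False

    def _check_prepared(self):
        if not self.is_prepared:
            raise RuntimeError(self._NOT_PREPARED_MSG)

    def prepare_for_training(self):
        # Real code would also move weights/optimizer state onto the GPU here.
        self.is_prepared = True

    def prepare_for_lp_inference(self):
        # Real code would stage the model for log-prob inference here.
        self.is_prepared = True

    def train(self):
        self._check_prepared()
        # ... GPU-bound training step ...

    def get_logprobs(self):
        self._check_prepared()
        # ... GPU-bound log-prob computation ...

    def generate(self):
        self._check_prepared()
        # ... GPU-bound generation ...
```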
Sequence Diagram(s)
sequenceDiagram
autonumber
participant C as Caller
participant W as MegatronPolicyWorker
Note over W: is_prepared defaults to False
C->>W: prepare_for_training() / prepare_for_lp_inference()
activate W
W->>W: set is_prepared = True
deactivate W
alt Prepared
C->>W: train()/get_logprobs()/generate()
W-->>C: proceed with GPU-bound execution
else Not prepared
C->>W: train()/get_logprobs()/generate()
W-->>C: RuntimeError("Model must be prepared before execution")
end
Note over W: Guards enforce explicit preparation before use
Estimated code review effort
🎯 2 (Simple) | ⏱️ ~10 minutes
Pre-merge checks and finishing touches
❌ Failed checks (1 warning)
| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 42.86% which is insufficient. The required threshold is 80.00%. | You can run @coderabbitai generate docstrings to improve docstring coverage. |
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Test Results For Major Changes | ✅ Passed | The PR adds readiness guards and a boolean flag to MegatronPolicyWorker to raise clear errors if training/inference methods are called before preparation. This is a behavioral safety check that does not alter numerics, convergence, or steady-state performance when properly prepared, and only adds early-exit errors otherwise. The PR description does not include test results, but given the scope, this qualifies as a minor change under the check’s criteria. Therefore, test results are not required for this PR to pass the check. |
| Title Check | ✅ Passed | The title clearly and concisely captures the primary change—adding an explicit error message when prepare_for_*() is not called—using straightforward language that aligns with the PR’s objective of enforcing preparation steps before GPU execution. |
@terrykong I've updated the changes as per your suggestions, please let me know if this needs improvement.
Hi @Aniketsy. Right now it looks like once prepare is called, the flag is never set back to false, so you can still hit the situation where execution fails without this nice message because the parameters were offloaded.
Do you see any way to do this without introducing a complex state machine?
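One lightweight option, sketched below, would be to clear the same boolean wherever parameters leave the GPU, so no state machine beyond the existing flag is needed. The `offload_to_cpu` hook here is hypothetical (the real worker's offload method may be named differently), and this is a suggestion, not code from the PR:

```python
class Worker:
    """Sketch: one boolean tracks readiness; offload paths clear it."""

    def __init__(self):
        self.is_prepared = False

    def prepare_for_training(self):
        # ... move params/optimizer state onto the GPU ...
        self.is_prepared = True

    def offload_to_cpu(self):  # hypothetical offload hook
        # ... move params off the GPU ...
        # Clearing the flag means a later train()/generate() fails loudly
        # with the helpful message instead of crashing on offloaded params.
        self.is_prepared = False

    def train(self):
        if not self.is_prepared:
            raise RuntimeError("Model must be prepared before execution")
        # ... GPU-bound training step ...
```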
@terrykong Sorry, I missed the notification. I’ll look into this again. I’ll also explore if there’s a way to address this without introducing a complex state machine.