Chris Wing issues

Results: 23 issues of Chris Wing

ray.sub requires submission from NeMo RL home directory, blocking external workflow organization

**Describe the bug** The ray.sub script currently requires users to submit SLURM jobs from the NeMo RL repository directory, which prevents users from organizing their training scripts and experiments in...

bug

Docs: Multi-Step Patterns

## Document Multi-Step Patterns ### Context Users have asked how to handle multi-step agentic tasks where the agent makes multiple tool calls within a single trajectory. This requires understanding: -...

external

documentation

Docs + Environment pattern: Multi-Turn Training Pattern

### Background Users want to train models on multi-turn conversational tasks where the agent handles back-and-forth interactions with a user. ### Problem Users need guidance on: - What "multi-turn" means...

external

documentation

Docs + Environment pattern: Modeling User Using LLM During Multi-Turn Training

### Background Often during multi-turn conversational training, users need to simulate realistic user responses during rollout collection. ### Problem Users need guidance on: - When to use LLM-based user simulation...

resource-server

external

documentation

Docs + Environment pattern: How to Incorporate LLM as a Judge in Verification

## Tutorial: How to Incorporate LLM as a Judge in Verification Logic ### Background Users have asked how to use LLM-as-a-judge for verification in their resource servers. This is particularly...

resource-server

external

documentation

Add Architecture Diagram and Clarify NeMo Gym + NeMo RL Integration

## Add Architecture Diagram and Clarify NeMo Gym + NeMo RL Integration ### Background Users were unclear on the data flow between NeMo Gym, NeMo RL, and the policy model....

external

documentation

Integrate with Prime Intellect verifiers

**Use cases, pain points, and background** While they differ in their architectural approach, both frameworks target the same goal: enabling Reinforcement Learning from Verifiable Reward (RLVR) at scale. **Description**: Goal:...

resource-server

Workplace assistant has references to workbench

**Describe the bug** A user noted confusion because the column `environment_name` has the value `workbench`, but the resource server in NeMo Gym is named `workplace_assistant`. The user was also unclear...

resource-server

data

Incorrect dataset reference for workplace assistant data pre-processing

**Describe the bug** Workplace assistant `dataset_preprocess.py` has an incorrect dataset reference in the `get_samples` function. **Steps/Code to reproduce bug** This file doesn't exist: https://github.com/NVIDIA-NeMo/Gym/blob/287d08d4ccba3c146e44616ffa1cf3b9ddaab92b/resources_servers/workplace_assistant/dataset_preprocess.py#L87 `dataset = load_dataset("Nexusflow/250319-workplace_assistant-fulleval", split=split)` **Expected behavior** Use...

external

data

Add GitHub badges to README

Badges should be added to the top of the README: - License - CI/CD - Python version - Release version

repo-tooling