pjin-nvidia issues

Results: 12 issues of pjin-nvidia

feat: RL sampler [WIP]

# What does this PR do?

**Add a one line overview of what this PR aims to accomplish.**

# Issues

List issues that this PR closes ([syntax](https://docs.github.com/en/issues/tracking-your-work-with-issues/using-issues/linking-a-pull-request-to-an-issue#linking-a-pull-request-to-an-issue-using-a-keyword)):

# Usage...

pjin/nemotron-ray-dev-20251208 vs main

[DO NOT MERGE] pjin/misc-infra-log-fix

Ray utils

This PR adds support for spinning up GPU Ray actors, in particular vLLM instances, from within Gym servers.

Issue resolved:
- https://github.com/NVIDIA-NeMo/Gym/issues/282

PR merge dependency:
- https://github.com/NVIDIA-NeMo/Gym/pull/317

Related: -...

Servers with pyproject.toml should also have a corresponding uv.lock

For example, the vLLM responses API model server has a pyproject.toml, but no uv.lock: https://github.com/NVIDIA-NeMo/Gym/tree/main/responses_api_models/vllm_model

Without a uv.lock, server dependencies can get silently upgraded on server venv setup (e.g. vllm_model...
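One way to catch this class of problem is a small repository check. The sketch below is not part of Gym; the function name `find_missing_lockfiles` is hypothetical, and it only assumes that each server lives in a directory containing its own pyproject.toml.

```python
from pathlib import Path


def find_missing_lockfiles(root):
    """Return directories under `root` that contain a pyproject.toml but no uv.lock.

    Hypothetical helper for illustration; not an existing Gym utility.
    """
    root = Path(root)
    missing = []
    for pyproject in root.rglob("pyproject.toml"):
        # A lockfile is expected to sit next to the pyproject.toml it pins.
        if not (pyproject.parent / "uv.lock").exists():
            missing.append(pyproject.parent)
    return sorted(missing)
```

Such a check could run in CI and fail when the returned list is non-empty, so that a new server cannot be added without a committed lockfile.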

Server logging via stdout/stderr redirection

**Describe the bug** Currently, server logs are not propagated/displayed during normal operation or upon server setup failure. We should redirect stdout and stderr for servers started the normal way (Q:...
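As a rough illustration of the idea, the following sketch redirects a child process's stdout and stderr into a single log file and reads back the last lines so they can be shown on failure. The helper names `run_with_log` and `tail` are hypothetical, and this does not reflect how Gym actually launches servers.

```python
import subprocess
from pathlib import Path


def run_with_log(cmd, log_path):
    """Run `cmd`, sending both stdout and stderr to `log_path`; return the exit code.

    Hypothetical helper for illustration only.
    """
    with Path(log_path).open("ab") as log_file:
        # stderr=subprocess.STDOUT merges both streams into the same file.
        proc = subprocess.Popen(cmd, stdout=log_file, stderr=subprocess.STDOUT)
        return proc.wait()


def tail(log_path, n=20):
    """Return the last `n` lines of the log, e.g. to display after a failed setup."""
    lines = Path(log_path).read_text(errors="replace").splitlines()
    return lines[-n:]
```

A caller could check the return code and, when it is non-zero, print `tail(log_path)` so the setup failure is visible without opening the log file by hand.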

pjin/hross/mt-verifiers-main-ray-dev-20251208 vs main

Rollout collection with cached rows

This makes it easy to restart a rollout collection session without re-collecting rollouts for rows that are already cached in the jsonl output file.
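The resume behavior described here can be sketched as follows. This is a minimal illustration, not the PR's implementation: it assumes each row carries a unique `id` field, and the names `load_cached_ids`, `collect_with_cache`, and `collect_fn` are hypothetical.

```python
import json
from pathlib import Path


def load_cached_ids(output_path, key="id"):
    """Read the jsonl output file and return the set of row ids already present."""
    path = Path(output_path)
    cached = set()
    if not path.exists():
        return cached
    with path.open() as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            try:
                cached.add(json.loads(line)[key])
            except (json.JSONDecodeError, KeyError, TypeError):
                continue  # ignore malformed rows
    return cached


def collect_with_cache(rows, output_path, collect_fn, key="id"):
    """Collect rollouts only for rows not yet in the output file; return how many were new."""
    cached = load_cached_ids(output_path, key)
    n_new = 0
    with Path(output_path).open("a") as f:
        for row in rows:
            if row[key] in cached:
                continue  # already collected in an earlier session
            result = collect_fn(row)
            f.write(json.dumps({key: row[key], "rollout": result}) + "\n")
            f.flush()  # persist each row so an interrupted run can resume
            cached.add(row[key])
            n_new += 1
    return n_new
```

Appending and flushing after every row means that a session interrupted partway through leaves a file a later session can pick up from.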

pjin/hross/mt-verifiers-dev-20251202

Server response errors/exceptions not propagating to the top-level client

**Describe the bug** For example, suppose that during a NeMo RL + Gym run, an LLM-as-judge resources server is making calls to a judge model remote endpoint. Now, this remote...