bxyu-nvidia issues

Results: 40 issues of bxyu-nvidia

feat: Sanity check config datasets in nightly tests

**Use cases, pain points, and background** Why should we do this? Why is this needed or wanted? **Description**: What should we do? **Design**: What files should be touched? What logic...

Pipeclean NeMo RL training with all environments

**Use cases, pain points, and background** **Description**: **Design**: **Out of scope**: **Acceptance Criteria**: - [ ] All training environments must be trainable easily with an instruct and thinking model -...

Docs + Environment pattern: Responses-native models

**Use cases, pain points, and background** Why should we do this? Why is this needed or wanted? **Description**: What should we do? **Design**: What files should be touched? What logic...

feat: Support Responses-native vLLM models

**Use cases, pain points, and background** Why should we do this? Why is this needed or wanted? **Description**: What should we do? **Design**: What files should be touched? What logic...

Docs + Environment pattern: Integrate existing Agents

**Use cases, pain points, and background** Why should we do this? Why is this needed or wanted? **Description**: Just use OpenAI Agents SDK which uses OpenAI Responses schema which is...

Docs + Environment pattern: Multi-node Docker instances

**Use cases, pain points, and background** Why should we do this? Why is this needed or wanted? **Description**: What should we do? **Design**: What files should be touched? What logic...

feat: TRL Integration

**Use cases, pain points, and background** Why should we do this? Why is this needed or wanted? **Description**: What should we do? **Design**: What files should be touched? What logic...

training-fw-integration

feat: Reward model support

**Use cases, pain points, and background** Why should we do this? Why is this needed or wanted? **Description**: What should we do? **Design**: What files should be touched? What logic...

feat: VLLMModel has option to spin up local vllm

**Use cases, pain points, and background** Right now users have to point to some external model endpoint. We should enable users to spin up their own. **Description**: Add an option...

model-server

Docs + Environment pattern: RLHF

**Use cases, pain points, and background** **Description**: **Design**: We probably need to make some generic reward model client that can be shared infra for all RLHF environments. **Out of scope**:...

core-infra