bxyu-nvidia
**Use cases, pain points, and background** Why should we do this? Why is this needed or wanted? **Description**: What should we do? **Design**: What files should be touched? What logic...
**Use cases, pain points, and background** **Description**: **Design**: **Out of scope**: **Acceptance Criteria**: - [ ] All training environments must be easily trainable with both an instruct model and a thinking model -...
**Use cases, pain points, and background** Why should we do this? Why is this needed or wanted? **Description**: Just use the OpenAI Agents SDK, which uses the OpenAI Responses schema, which is...
**Use cases, pain points, and background** Right now, users have to point to an external model endpoint. We should enable users to spin up their own. **Description**: Add an option...
**Use cases, pain points, and background** **Description**: **Design**: We probably need a generic reward model client that can serve as shared infrastructure for all RLHF environments. **Out of scope**:...