Gym icon indicating copy to clipboard operation
Gym copied to clipboard

feat: TRL Integration

Open bxyu-nvidia opened this issue 1 month ago • 1 comments

Use cases, pain points, and background Why should we do this? Why is this needed or wanted?

Description: What should we do?

Design: What files should be touched? What logic should be written?

Out of scope: What are some items that this issue could be mistaken to cover that this issue should explicitly NOT cover?

Acceptance Criteria:

  • [ ] Individual items that need to be finished in order for this issue to be considered completed

bxyu-nvidia avatar Nov 21 '25 17:11 bxyu-nvidia

TRL has a custom rollout function and vllm server mode that makes the integration easier. The vllm server is not a typical AsyncLLMEngine, it does not have openai chat completions/responses endpoints currently, we may have to add support for that https://github.com/huggingface/trl/issues/4602.

cmunley1 avatar Dec 02 '25 18:12 cmunley1