
A modular, primitive-first, python-first PyTorch library for Reinforcement Learning.

Results 254 rl issues

## Motivation
Oftentimes we only want to train an algorithm until it has learned the intended behavior, and the total number of frames is only a proxy for the stopping...

enhancement
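The request above can be pictured as a training loop that ends on a learned-behavior criterion, with the frame budget kept only as an upper bound. This is a minimal sketch in plain Python: `train_until`, `reward_threshold`, and the toy reward curve are hypothetical names chosen for illustration and are not part of the TorchRL API.

```python
from collections import deque
from typing import Callable, Iterable


def train_until(
    batches: Iterable[float],
    should_stop: Callable[[deque], bool],
    window: int = 10,
) -> int:
    """Consume per-batch mean rewards until `should_stop` fires.

    `batches` stands in for a data collector (hypothetical, not TorchRL API).
    Returns the number of batches consumed.
    """
    recent = deque(maxlen=window)
    consumed = 0
    for reward in batches:
        consumed += 1
        recent.append(reward)
        # A real loop would run the optimization step here.
        if len(recent) == window and should_stop(recent):
            break
    return consumed


def reward_threshold(target: float) -> Callable[[deque], bool]:
    """Stop once the mean reward over the window reaches `target`."""
    return lambda recent: sum(recent) / len(recent) >= target


# Toy reward curve: improves by 1.0 per batch, with a budget of 100 batches.
rewards = [float(i) for i in range(100)]
n_batches = train_until(rewards, reward_threshold(20.0), window=5)
```

Here the budget of 100 batches plays the role of the total frame count: it only bounds the loop, while the reward criterion decides when training actually ends.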

Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom):
* __->__ #2358

enhancement
CLA Signed

## Without the primer, the collector does not feed any hidden state to the policy
In the [RNN tutorial](https://github.com/pytorch/rl/blob/main/tutorials/sphinx-tutorials/dqn_with_rnn.py) it is stated that the primer is optional and it is...

bug

- [ ] `break_when_all_done` in `env.rollout()` #2355
- [ ] Partial steps in env #2356
- [ ] `BatchedEnv`: pass the indices of envs where a step should be done...

enhancement
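How the items above could fit together is sketched below with a toy batched environment: a partial step advances only the sub-environments whose indices are passed, and the rollout loop stops once every sub-environment is done. `ToyBatchedEnv` and this `rollout` function are hypothetical stand-ins written for illustration; they do not reproduce TorchRL's `BatchedEnv` or `env.rollout()` signatures.

```python
from typing import List, Optional, Sequence


class ToyBatchedEnv:
    """Batch of counters; sub-env i is done after `lengths[i]` steps.

    Hypothetical stand-in for a batched environment, not TorchRL's `BatchedEnv`.
    """

    def __init__(self, lengths: Sequence[int]) -> None:
        self.lengths = list(lengths)
        self.steps = [0] * len(self.lengths)

    @property
    def done(self) -> List[bool]:
        return [s >= n for s, n in zip(self.steps, self.lengths)]

    def step(self, indices: Optional[Sequence[int]] = None) -> None:
        """Partial step: advance only the sub-envs listed in `indices`."""
        if indices is None:
            indices = range(len(self.lengths))
        for i in indices:
            self.steps[i] += 1


def rollout(env: ToyBatchedEnv, max_steps: int, break_when_all_done: bool = True) -> int:
    """Step unfinished sub-envs; return the number of loop iterations run."""
    iterations = 0
    for _ in range(max_steps):
        active = [i for i, d in enumerate(env.done) if not d]
        if break_when_all_done and not active:
            break
        env.step(active)  # finished sub-envs are left untouched
        iterations += 1
    return iterations


env = ToyBatchedEnv([2, 5, 3])
n_iterations = rollout(env, max_steps=10)
```

With episode lengths of 2, 5, and 3, the loop runs for as many iterations as the longest episode needs, and no sub-environment is stepped past its own end.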

Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom):
* #2359
* #2358
* __->__ #2354
* #2307
* #2306
* #2305
* #2304

CLA Signed

## Motivation
#2355 would be much cleaner if we could do partial steps in batched or stateless envs.
### Design question
- Should we index the batched env to make...

enhancement

## Description
Add LLM Collector
## Motivation and Context
#2872
## Types of changes
What types of changes does your code introduce? Remove all that do not apply:
- [...

CLA Signed

## Describe the bug
I see very cool advancements in the direction of LLM RL training in the repo, awesome work! :) After playing a bit with the LLMEnv I...

bug

## Motivation
We need a [collector](https://pytorch.org/rl/stable/reference/generated/torchrl.collectors.SyncDataCollector.html?highlight=syncdatacollector) that fits the LLM space well. We will need to simplify the rollout function greatly; I would rewrite it from scratch. The LLMEnv...

enhancement

Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom):
* __->__ #2865

CLA Signed