rl issues

[Feature Request] `EarlyStopping` for `torchrl.trainers.Trainer`

## Motivation Oftentimes we only want to train an algorithm until it has learned the intended behavior, and the total number of frames is only a proxy for the stopping...

jkrude

enhancement

[Feature] MCTS Scoring functions

3

Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * __->__ #2358

vmoens

enhancement

CLA Signed

[BUG] Clarify non-optionality of `TensordictPrimer`

## Without the primer, the collector does not feed any hidden state to the policy In the [RNN tutorial](https://github.com/pytorch/rl/blob/main/tutorials/sphinx-tutorials/dqn_with_rnn.py) it is stated that the primer is optional and it is...

matteobettini

bug

[Feature Request] MCTS Issue tracker

- [ ] `break_when_all_done` in `env.rollout()` #2355
- [ ] Partial steps in env #2356
- [ ] `BatchedEnv`: pass the indices of envs where a step should be done...

vmoens

enhancement

[Feature] TensorSpec.enumerate()

1

Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * #2359 * #2358 * __->__ #2354 * #2307 * #2306 * #2305 * #2304

vmoens

CLA Signed

[Feature Request] Partial steps in env

## Motivation #2355 would be much cleaner if we could do partial steps in batched or stateless envs. ### Design question - Should we index the batched env to make...

vmoens

enhancement

[DRAFT] Initial LLM collector

1

## Description Add LLM Collector ## Motivation and Context #2872 ## Types of changes What types of changes does your code introduce? Remove all that do not apply: - [...

Lucaskabela

CLA Signed

[BUG] RuntimeError when passing dialogue data to LLMEnv

1

## Describe the bug I see very cool advancements in the direction of LLM RL training in the repo, awesome work! :) After playing a bit with the LLMEnv I...

albertbou92

bug

[Feature Request] A collector designed for LLMs

## Motivation We need a [collector](https://pytorch.org/rl/stable/reference/generated/torchrl.collectors.SyncDataCollector.html?highlight=syncdatacollector) that fits the LLM space well. We will need to simplify the rollout function greatly - I would rewrite it from scratch. The LLMEnv...

vmoens

enhancement

v0 param server (using collectives not object store)

1

Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * __->__ #2865

mikaylagawarecki

CLA Signed
