rl issues

Document the TensorDict structure of the return of the _step() function for a multi agent environment

2

## Motivation It is not very clear what structure the TensorDict returned by the _step() function should have for a multi-agent environment. If there are two...

zoetsekas

enhancement

[Doc] Better doc for distributed RBs

3

vmoens

CLA Signed

[BUG] remove error catches (try/except) in objectives

2

We still have a bunch of try/except in losses such as PPO to compute the entropy. We need to remove them for compile compatibility.

vmoens

bug

[BUG] sota-implementation requires nightly built

2

## Describe the bug dqn_cartpole from sota-implementations/dqn doesn't work. It crashes with: **ImportError: cannot import name 'Composite' from 'torchrl.data'** ## To Reproduce Just run dqn_cartpole.py; it will call utils_cartpole.py and then...

jimdor

bug

[Feature] use_vmap=False for SAC

3

Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * #2393 * __->__ #2392

vmoens

enhancement

CLA Signed

[Feature] non-functional SAC loss

3

Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * __->__ #2393 * #2392

vmoens

CLA Signed

[Feature Request] Extend TDLambdaEstimator with QLambdaEstimator

## Motivation Attempting to implement [Parallel Q Networks](https://www.researchgate.net/publication/382080747_Simplifying_Deep_Temporal_Difference_Learning) (online DQN without replay buffer or target networks). Uses QLambda returns. ## Solution TDLambdaEstimator expects `state_value` keys but we would now need...

roger-creus

enhancement

[Feature] flexible batch_locked for jumanji

3

Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * __->__ #2382

vmoens

enhancement

CLA Signed

Environments

[Algorithm] TD3 fast

3

Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * __->__ #2389

vmoens

CLA Signed

[Feature Request] Metadata for specs

2

> Hi there. Besides the naming, what do you think of adding some metadata for users to populate? > > This could be useful for, for example, marking which keys...

vmoens
