rl issues

[CI, Tests] Faster tests

1

## Description Small tweaks to run faster tests

vmoens

CLA Signed

[Feature] QMIX example

1

## Description Parent PR of the QMIX example algorithm

vmoens

enhancement

CLA Signed

[Feature] structured hydra configs

1

## Description Implements training script instantiations from structured hydra configs - [x] Collectors #471 - [x] Envs and transforms #472 - [x] replay buffers #493 - [x] models #496 -...

vmoens

enhancement

CLA Signed

[Feature] MBPO Support

## Description Support for MBPO ## Motivation and Context MBPO is a sample-efficient method that improves over SAC. ## Types of changes What types of changes does your code...

nicolas-dufour

CLA Signed

[CodeQuality]: better set comparison

## Description Better set comparison, using `==` whenever possible. ## Motivation and Context Originally, we tested that the difference between two sets was empty and that the intersection was complete. `==` does the...

vmoens

CLA Signed

quality
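
The set-comparison change described above can be illustrated with a minimal sketch. The function names `sets_match_verbose` and `sets_match` and the sample keys are hypothetical and do not come from the PR; the first function only approximates the "difference is empty, intersection is complete" style the description mentions.

```python
def sets_match_verbose(a: set, b: set) -> bool:
    # Assumed reconstruction of the older style: nothing in `a` is missing
    # from `b`, and the intersection covers all of `b`.
    return len(a - b) == 0 and len(a & b) == len(b)


def sets_match(a: set, b: set) -> bool:
    # Direct equality expresses the same check in a single comparison.
    return a == b


# Hypothetical key sets, used only for illustration.
expected = {"observation", "action", "reward"}
actual = {"reward", "action", "observation"}

assert sets_match(expected, actual)
assert sets_match_verbose(expected, actual) == sets_match(expected, actual)
```

Both functions agree on every pair of sets; the `==` form is shorter and states the intent directly.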

[WIP] Transform for multi-agent of variable number

vmoens

CLA Signed

[BUG] Numerical Instability issues with `torchrl.modules.TanhNormal`

5

## Describe the bug When training on `PettingZoo/MultiWalker-v9` with `Multi-Agent Soft Actor-Critic`, **all** losses (`loss_actor`, `loss_qvalue`, `loss_alpha`) explode after ~1M environment steps at most. This phenomenon occurs regardless of (reasonable)...

N00bcak

bug

[Feature Request] multiple GPUs on a single machine

2

## Motivation I want to train using multiple GPUs on a single machine, but I can't find relevant tutorial documentation. Could you provide an example of training using multiple GPUs...

sgfCrazy

enhancement

[Algorithm] Update scripts with compile

1

Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * (to be filled)

vmoens

CLA Signed

[Algorithm] Update scripts with compile

2

Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * (to be filled)

vmoens

CLA Signed
