rl issues

[Feature Request] Add linear interpolation to FrameSkipTransform

3

## Motivation I would suggest adding an option for linear interpolation to the `FrameSkipTransform` ([documentation here](https://pytorch.org/rl/reference/generated/torchrl.envs.transforms.FrameSkipTransform.html#torchrl.envs.transforms.FrameSkipTransform)). This means, instead of repeating the same action `frame_skip` number of times for each...

maxweissenbacher

enhancement
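
For illustration only, here is a minimal sketch of the difference between repeating an action and linearly interpolating towards it over `frame_skip` steps. It does not use torchrl; the helper names `repeated_actions` and `interpolated_actions` are hypothetical, and the assumption that interpolation runs from the previous action to the new one is ours, since the preview above is truncated.

```python
def repeated_actions(new_action, frame_skip):
    """Frame skip as usually understood: apply the same action frame_skip times."""
    return [list(new_action) for _ in range(frame_skip)]


def interpolated_actions(prev_action, new_action, frame_skip):
    """Hypothetical alternative: move linearly from prev_action to new_action.

    Step k (1-based) applies prev + (k / frame_skip) * (new - prev),
    so the last of the frame_skip steps applies new_action itself.
    """
    if frame_skip < 1:
        raise ValueError("frame_skip must be >= 1")
    steps = []
    for k in range(1, frame_skip + 1):
        t = k / frame_skip  # interpolation weight in (0, 1]
        steps.append([p + t * (n - p) for p, n in zip(prev_action, new_action)])
    return steps


if __name__ == "__main__":
    print(repeated_actions([1.0], 4))
    print(interpolated_actions([0.0], [1.0], 4))
```

Whether the interpolation should start from the previous action, and whether the final step should land exactly on the new action, are design choices the issue itself would need to settle.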

[BUG] Planning (`MPCPlannerBase`) should consider `done`

9

## Describe the bug In the current implementation, no subclass of `MPCPlannerBase` considers the `done` signal raised by the env during the planning process, which means that MPC is invalid in...

FrankTianTT

bug

[BUG] Collector deletes new tensordict keys

6

New Tensordict keys created in policy modules used in the collector are not returned in the rollout tensordict. The issue is in this line, as setting "out" for torch.stack deletes...

Acciorocketships

bug

[Feature Request] Implement TQC for the example algorithms

6

## Motivation I suggest adding an implementation of [TQC](https://github.com/SamsungLabs/tqc_pytorch) to the examples. I suggest adding this as a request in the 'call for distributions' stack. I would be happy to...

maxweissenbacher

enhancement

[Feature] Add support for Unity MLAgents environments.

31

## Description This adds support for Unity MLAgents environments, a popular way of creating games for reinforcement learning settings. This pull request is still definitely a work-in-progress, and...

hyerra

CLA Signed

Environments

[WIP, Feature] MCTS

7

Implements MCTS planners. We design an `_MCTSNode` class that represents a node in the graph. It is stateless (or so) in the sense that it does not store information internally...

vmoens

enhancement

CLA Signed

[BUG] BoundedTensorSpec zero() returns out of bounds tensor

1

## Describe the bug Calling `zero()` on the `BoundedTensorSpec` will return a tensor that is out of bounds. Would it make sense to return the minimum instead? Or raise an...

smorad

bug
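
To make the report concrete, here is a small sketch in plain Python (no torchrl) of how an all-zeros value can fall outside a bounded range, together with the two options the issue raises. The function names are hypothetical and this is not how `BoundedTensorSpec` is implemented; it only illustrates the check.

```python
def zero_is_in_bounds(low, high):
    """True if 0 lies within [low_i, high_i] for every dimension."""
    return all(l <= 0.0 <= h for l, h in zip(low, high))


def zero_or_minimum(low, high):
    """Option 1 from the issue: fall back to the lower bound when 0 is out of range."""
    if zero_is_in_bounds(low, high):
        return [0.0 for _ in low]
    return list(low)


def zero_or_raise(low, high):
    """Option 2 from the issue: refuse to return an out-of-bounds value."""
    if not zero_is_in_bounds(low, high):
        raise ValueError("zero is outside the bounds of this spec")
    return [0.0 for _ in low]


if __name__ == "__main__":
    low, high = [1.0, 2.0], [3.0, 4.0]   # 0 is below both lower bounds
    print(zero_is_in_bounds(low, high))
    print(zero_or_minimum(low, high))
```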

[Feature Request] Missing ActionScaling and FlattenAction

4

## Motivation In general, the action space may be a dict instead of a plain vector, and is not normalized. Even for a classic `gym` environment, action wrappers performing action...

duburcqa

enhancement

Good first issue

[Refactor] Dreamer v1 refactor

## Description Draft PR to facilitate discussion: The proposed change extracts training steps into wrappers to reduce the length of the example, but as there are few control logging parameters needed, it...

MateuszGuzek

CLA Signed

[Example] RLHF end to end example

Merge after #1309, #1319, #1316, #1315 + rebase. Adds a complete end-to-end RLHF pipeline.

apbard

enhancement

CLA Signed
