
A modular, primitive-first, python-first PyTorch library for Reinforcement Learning.

Results 254 rl issues

## Motivation I would suggest adding an option for linear interpolation to the `FrameSkipTransform` ([documentation here](https://pytorch.org/rl/reference/generated/torchrl.envs.transforms.FrameSkipTransform.html#torchrl.envs.transforms.FrameSkipTransform)). This means, instead of repeating the same action `frame_skip` number of times for each...

enhancement
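To make the request above concrete, here is a minimal sketch of what linearly interpolated actions across skipped frames could look like. It is not the `FrameSkipTransform` API: the helper name `interpolate_actions` and the choice to interpolate from the previous action to the new one are assumptions, since the issue text is truncated. Actions are plain lists of floats to keep the example dependency-free.

```python
from typing import List, Sequence


def interpolate_actions(
    prev_action: Sequence[float],
    next_action: Sequence[float],
    frame_skip: int,
) -> List[List[float]]:
    """Hypothetical helper: one action per skipped frame, moving linearly
    from prev_action towards next_action (reached on the last frame)."""
    if frame_skip < 1:
        raise ValueError("frame_skip must be at least 1")
    if len(prev_action) != len(next_action):
        raise ValueError("actions must have the same length")
    actions = []
    for k in range(1, frame_skip + 1):
        w = k / frame_skip  # interpolation weight, grows from 1/frame_skip to 1
        actions.append(
            [(1.0 - w) * p + w * n for p, n in zip(prev_action, next_action)]
        )
    return actions


# Plain frame skipping would apply [1.0] four times; here the action ramps up.
ramp = interpolate_actions([0.0], [1.0], frame_skip=4)
print(ramp)
```

With `frame_skip=1` the helper returns only the new action, so it reduces to an ordinary single step.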

## Describe the bug In the current implementation, no subclass of `MPCPlannerBase` takes into account the `done` signal returned by the env during the planning process, which means that MPC is invalid in...

bug
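As an illustration of the behaviour the report asks for, the sketch below scores one candidate action sequence and stops accumulating reward once the environment signals `done`. It does not use `MPCPlannerBase` or any torchrl API; `rollout_return` and the toy `chain_step` environment are hypothetical names introduced only for this example.

```python
from typing import Callable, Iterable, Tuple

# Assumed step signature for the sketch: (state, action) -> (next_state, reward, done)
StepFn = Callable[[int, int], Tuple[int, float, bool]]


def rollout_return(step_fn: StepFn, state: int, actions: Iterable[int]) -> float:
    """Sum rewards along a planned action sequence, ignoring every step
    that would come after the episode has terminated."""
    total = 0.0
    for action in actions:
        state, reward, done = step_fn(state, action)
        total += reward
        if done:
            break  # steps past termination must not contribute to the score
    return total


def chain_step(state: int, action: int) -> Tuple[int, float, bool]:
    """Toy environment: move along a chain, reward 1 per step, done at state >= 2."""
    next_state = state + action
    return next_state, 1.0, next_state >= 2


# The plan has four steps, but the episode ends after two of them.
score = rollout_return(chain_step, state=0, actions=[1, 1, 1, 1])
print(score)
```

A planner that ignored `done` would score this plan as 4.0 and could prefer trajectories that are not reachable in the real environment.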

New TensorDict keys created in policy modules used in the collector are not returned in the rollout tensordict. The issue is in this line, as setting "out" for torch.stack deletes...

bug

## Motivation I suggest adding an implementation of [TQC](https://github.com/SamsungLabs/tqc_pytorch) to the examples. I suggest adding this as a request in the 'call for distributions' stack. I would be happy to...

enhancement

## Description This adds support for Unity MLAgent environments, which are a popular way of creating games for reinforcement learning settings. This pull request is still definitely a work in progress, and...

CLA Signed
Environments

Implements MCTS planners. We design an `_MCTSNode` class that represents a node in the graph. It is stateless (or so) in the sense that it does not store information internally...

enhancement
CLA Signed

## Describe the bug Calling `zero()` on the `BoundedTensorSpec` will return a tensor that is out of bounds. Would it make sense to return the minimum instead? Or raise an...

bug
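The two options raised in the report (return a bound instead of zero, or raise) can be sketched for a scalar interval as follows. This is not the `BoundedTensorSpec` implementation; `zero_within_bounds` and its `strict` flag are hypothetical, and the non-strict branch assumes that "nearest bound" is the intended fallback, which equals the minimum whenever the whole interval is positive.

```python
def zero_within_bounds(low: float, high: float, strict: bool = False) -> float:
    """Hypothetical helper: return 0.0 if it lies in [low, high]; otherwise
    either raise (strict=True) or fall back to the bound closest to zero."""
    if low > high:
        raise ValueError("low must not exceed high")
    if low <= 0.0 <= high:
        return 0.0
    if strict:
        raise ValueError(f"0 is outside the bounds [{low}, {high}]")
    # Interval entirely positive: the minimum is closest to zero.
    # Interval entirely negative: the maximum is closest to zero.
    return low if low > 0.0 else high


print(zero_within_bounds(-1.0, 1.0))  # zero is valid here
print(zero_within_bounds(1.0, 2.0))   # falls back to the lower bound
```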

## Motivation In general, the action space may be a dict instead of a plain vector, and is not normalized. Even for a classic `gym` environment, action wrappers performing action...

enhancement
Good first issue

## Description Draft PR to facilitate discussion: The proposed change extracts training steps into wrappers to reduce the length of the example, but as there are few control logging parameters needed, it...

CLA Signed

Merge after #1309, #1319, #1316, #1315 + rebase. Adds a complete end-to-end RLHF pipeline.

enhancement
CLA Signed