rl issues

First draft for modular Hindsight Experience Replay Transform

1

## Description I have added the Hindsight Experience Replay Transform, specifically implementing the `future` and `last` strategies as described in the paper. The transform is a combination of 3 transforms:...

dtsaras

enhancement

CLA Signed
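
For readers unfamiliar with the relabeling strategies named in this entry, here is a minimal, library-free sketch of the idea. It is not the code from the pull request. The episode layout (a list of dicts with `obs`, `action`, `achieved_goal`, `goal`), the helper names `sparse_reward`, `relabel_future` and `relabel_last`, and the 0 / -1 sparse reward are all assumptions made for illustration. It also assumes that `last` corresponds to the strategy the paper calls `final`.

```python
import random


def sparse_reward(achieved_goal, goal):
    # Assumed reward convention: 0 when the goal is reached, -1 otherwise.
    return 0.0 if achieved_goal == goal else -1.0


def relabel_future(episode, k=4, rng=None):
    """For each step t, create k copies whose goal is an achieved goal
    sampled from steps t..T-1 of the same episode (the `future` strategy)."""
    rng = rng or random.Random()
    horizon = len(episode)
    relabeled = []
    for t, step in enumerate(episode):
        for _ in range(k):
            j = rng.randrange(t, horizon)  # index of a current or later step
            new_goal = episode[j]["achieved_goal"]
            relabeled.append({
                "obs": step["obs"],
                "action": step["action"],
                "achieved_goal": step["achieved_goal"],
                "goal": new_goal,
                "reward": sparse_reward(step["achieved_goal"], new_goal),
            })
    return relabeled


def relabel_last(episode):
    """Create one copy of each step whose goal is the goal achieved at the
    end of the episode (assumed to match the `last` strategy)."""
    final_goal = episode[-1]["achieved_goal"]
    return [
        {
            "obs": step["obs"],
            "action": step["action"],
            "achieved_goal": step["achieved_goal"],
            "goal": final_goal,
            "reward": sparse_reward(step["achieved_goal"], final_goal),
        }
        for step in episode
    ]
```

In a replay-buffer setting, the relabeled transitions would typically be stored alongside the original ones, so that episodes which never reached their original goal still produce some non-negative rewards.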

[Feature Request] Add Maniskill3 Custom Env

5

## Motivation Maniskill3 is gaining a lot of traction recently and it offers great features like parallel GPU vectorized environments, a great set of tasks from simple to complex and...

AlexandreBrown

enhancement

[DRAFT] ppo chess with llm and ConditionalPolicySwitch to sunfish bot

1

Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * __->__ #2763 * #2711 * #2709

mikaylagawarecki

CLA Signed

[Feature] ConditionalPolicySwitch transform

1

Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * #2763 * __->__ #2711 * #2709

vmoens

enhancement

CLA Signed

[Example] Self-play chess PPO example

1

Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * #2763 * #2711 * __->__ #2709

vmoens

CLA Signed

Examples

[Feature Request] Allow CUDNN based modules in losses

3

## Motivation Currently we cannot use CUDNN based modules in loss modules as they are incompatible with vmap used in most of the losses. Particularly for RNN modules this leaves...

splatter96

enhancement

[BUG] Maniskill3 crashes on D2H transfer after env rollout

1

## Describe the bug Maniskill3 crashes after env.rollout when transferring data to host (cuda to cpu).

```python
for _ in tqdm(range(nb_iters), "Evaluation"):
    rollouts = self.eval_env.rollout(
        max_steps=self.env_max_frames_per_traj,
        policy=policy,
        auto_reset=False,
        auto_cast_to_device=False,
        tensordict=tensordict,...
```

AlexandreBrown

bug

[BUG] SafeProbabilisticModule constructor missing `log_prob_keys` argument

## Describe the bug When disabling logprob aggregation for a probabilistic actor you are supposed to pass a sequence of `log_prob_keys` as a parameter instead of a single `log_prob_key`. However,...

rerz

bug

[Feature Request] Make sure that all losses work with tensorclasses and regular tensors

2

## Motivation We should add tests for all losses with - [ ] tensors passed via `tensordict.nn.dispatch`. This should be accompanied by an example in the docstrings of how to...

vmoens

enhancement

Good first issue

[WIP] Compute lp during loss execution

3

Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * __->__ #2688 * #2687 * #2665

vmoens

CLA Signed
