rl
A modular, primitive-first, python-first PyTorch library for Reinforcement Learning.
## Description I have added the Hindsight Experience Replay Transform, specifically implementing the `future` and `last` strategies as described in the paper. The transform is a combination of 3 transforms:...
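To make the two strategies concrete, here is a minimal, library-free sketch of hindsight goal relabeling. It is not the transform from this PR and does not use the TorchRL API: `relabel_goals`, `sparse_reward`, and the parameter `k` are hypothetical names chosen for illustration. It assumes `achieved[t]` is the goal achieved after step `t`, that `future` samples substitute goals from step `t` onward in the same trajectory, and that `last` reuses the final achieved goal of the trajectory.

```python
import random
from typing import Hashable, List, Optional, Sequence, Tuple


def relabel_goals(
    achieved: Sequence[Hashable],
    strategy: str = "future",
    k: int = 4,
    rng: Optional[random.Random] = None,
) -> List[Tuple[int, Hashable]]:
    """Return (step_index, substitute_goal) pairs for one trajectory.

    Hypothetical helper: `achieved[t]` is assumed to be the goal achieved
    after step t of the trajectory.
    """
    rng = rng or random.Random()
    num_steps = len(achieved)
    pairs: List[Tuple[int, Hashable]] = []
    for t in range(num_steps):
        if strategy == "last":
            # Every step is relabeled with the final achieved goal.
            pairs.append((t, achieved[-1]))
        elif strategy == "future":
            # Draw k goals achieved at step t or later in the same trajectory.
            for _ in range(k):
                future_t = rng.randint(t, num_steps - 1)  # bounds are inclusive
                pairs.append((t, achieved[future_t]))
        else:
            raise ValueError(f"unknown strategy: {strategy!r}")
    return pairs


def sparse_reward(achieved_goal: Hashable, goal: Hashable) -> float:
    """Assumed sparse reward: 0 when the goal is reached, -1 otherwise."""
    return 0.0 if achieved_goal == goal else -1.0
```

Each relabeled pair would then be stored as an extra transition whose reward is recomputed against the substitute goal, for example `sparse_reward(achieved[t], goal)`.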
## Motivation Maniskill3 has been gaining a lot of traction recently and it offers great features like parallel GPU vectorized environments, a great set of tasks from simple to complex and...
Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * __->__ #2763 * #2711 * #2709
Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * #2763 * __->__ #2711 * #2709
Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * #2763 * #2711 * __->__ #2709
## Motivation Currently we cannot use cuDNN-based modules in loss modules, as they are incompatible with the vmap used in most of the losses. Particularly for RNN modules, this leaves...
## Describe the bug
Maniskill3 crashes after `env.rollout` when transferring data to host (CUDA to CPU).

```python
for _ in tqdm(range(nb_iters), "Evaluation"):
    rollouts = self.eval_env.rollout(
        max_steps=self.env_max_frames_per_traj,
        policy=policy,
        auto_reset=False,
        auto_cast_to_device=False,
        tensordict=tensordict,
        ...
```
## Describe the bug When disabling logprob aggregation for a probabilistic actor you are supposed to pass a sequence of `log_prob_keys` as a parameter instead of a single `log_prob_key`. However,...
## Motivation We should add tests for all losses with - [ ] tensors passed via `tensordict.nn.dispatch`. This should be accompanied by an example in the docstrings of how to...
Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom): * __->__ #2688 * #2687 * #2665