[Proposal] Inclusion of ActionRepeat wrapper
Proposal
I would like to propose an ActionRepeat wrapper that would make the wrapped environment repeat step() a specified number of times.
Motivation
I am working on implementing models like PlaNet and Dreamer, and I'm working with MuJoCo environments mostly. In these implementations, there is almost always a term like action_repeat. I think the proposed wrapper would simplify this line of implementation.
Pitch
Assuming that the overridden step() is called at time-step t, it would return the following:
observation[t + n_repeat], sum(reward[t + 1 : t + n_repeat + 1]), terminated, truncated, info[t + n_repeat]
That means the overridden step() would call the parent step() at least once, and it would assert that n_repeat is positive (>= 1).
If termination or truncation is reached during the action repetition, the loop would exit early, and the reward would be summed up to that point, while the observation and info from that time step would be returned.
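A minimal sketch of the behaviour described above, under a few assumptions: to stay dependency-free it wraps any object exposing the five-tuple step() API instead of subclassing gymnasium.Wrapper (which the actual PR would presumably do), it raises ValueError rather than using a bare assert, and CountingEnv is a hypothetical toy environment used only for illustration.

```python
class ActionRepeat:
    """Repeat each action up to n_repeat times (illustrative sketch only)."""

    def __init__(self, env, n_repeat):
        # The proposal asserts n_repeat is positive; ValueError is used here instead.
        if n_repeat < 1:
            raise ValueError("n_repeat must be >= 1")
        self.env = env
        self.n_repeat = n_repeat

    def reset(self, **kwargs):
        return self.env.reset(**kwargs)

    def step(self, action):
        total_reward = 0.0
        for _ in range(self.n_repeat):
            obs, reward, terminated, truncated, info = self.env.step(action)
            total_reward += reward
            # Exit early if the episode ends inside the repetition.
            if terminated or truncated:
                break
        # Observation and info come from the last inner step that was executed.
        return obs, total_reward, terminated, truncated, info


class CountingEnv:
    """Hypothetical toy env: reward 1.0 per step, terminates after `horizon` steps."""

    def __init__(self, horizon=5):
        self.horizon = horizon
        self.t = 0

    def reset(self, **kwargs):
        self.t = 0
        return self.t, {}

    def step(self, action):
        self.t += 1
        return self.t, 1.0, self.t >= self.horizon, False, {"t": self.t}


env = ActionRepeat(CountingEnv(horizon=5), n_repeat=3)
env.reset()
first = env.step(0)   # runs three inner steps
second = env.step(0)  # episode terminates after two inner steps
```

In this sketch the second call returns a summed reward covering only the two inner steps that ran before termination, which matches the early-exit behaviour described in the pitch.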
Alternatives
No response
Additional context
No response
Checklist
- [X] I have checked that there is no similar issue in the repo
This is an interesting wrapper, but we want to limit the number of wrappers we have, as each one requires more work from us to maintain. As this is relatively simple, I think this should be fine.
Could you make a PR for this with tests? I think this makes the most sense in the v1.0.0 branch in the wrappers/stateful_observations.py file
Hi @pseudo-rnd-thoughts, thank you for your comment. I am just starting to get involved in open source collaboration. So, do these steps look good for the pull request that I'll be making?
- Fork Gymnasium into my personal account
- Clone it locally onto my machine
- Switch to branch v1.0.0
- Add the new class ActionRepeat into wrappers/stateful_observations.py
- Commit with suitable message
- Push the local version into my forked repo.
- Create a pull request that includes the steps to perform to test out the functionality of my added code.
Thank you for looking into this.