[Feature Request] Add a Transform that remembers the previous action taken
Motivation
I propose adding a simple Transform that remembers the previous action taken. This would be useful for environments with time delay, or when training agents with recurrent policies (for instance, the policy could be a function not only of the state but also of the action taken at the previous time step).
Solution
I'd be happy to implement this if you think this could be useful.
The Transform could simply add a new key "previous_action" (or another key chosen by the user) to the "next" tensordict. The entry would copy the "action" from the input tensordict and would be initialised to zero (or another default value specified by the user).
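To make the intended behaviour concrete, here is a minimal, dependency-free sketch. It uses plain dicts and lists in place of tensordicts and tensors, and the class name `PreviousActionSketch`, its `reset`/`step` methods, and the `action_size` argument are all hypothetical names chosen for illustration, not TorchRL API.

```python
class PreviousActionSketch:
    """Illustrative stand-in for the proposed transform (not TorchRL code)."""

    def __init__(self, in_key="action", out_key="previous_action", default=0.0):
        self.in_key = in_key      # key to read the action from
        self.out_key = out_key    # key under which the previous action is stored
        self.default = default    # value used before any action has been taken

    def reset(self, td, action_size):
        # No action has been taken yet, so fill the entry with the default value.
        td[self.out_key] = [self.default] * action_size
        return td

    def step(self, td, next_td):
        # Copy the action that produced this transition into the "next" data.
        next_td[self.out_key] = list(td[self.in_key])
        return next_td


# Usage: one reset followed by one step.
transform = PreviousActionSketch()
td = transform.reset({"observation": [0.1, 0.2]}, action_size=2)
td["action"] = [1.0, -1.0]
next_td = transform.step(td, {"observation": [0.3, 0.4]})
```

After the reset, `td["previous_action"]` holds the default value; after the step, `next_td["previous_action"]` holds a copy of the action just taken.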
Checklist
- [x] I have checked that there is no similar issue in the repo (required)
Sure, that wouldn't be too difficult. The _step function of your transform will take the action from the input tensordict and copy it into the next_tensordict, and that is all!
You will just need to copy the action spec into the observation_spec under the appropriate key.
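As a rough illustration of the spec bookkeeping, the sketch below uses plain dicts as stand-ins for spec objects. The helper name `add_previous_action_spec` and the dict layout (`shape`, `dtype`, `low`, `high`) are assumptions made for this example and do not reflect the actual spec classes.

```python
import copy


def add_previous_action_spec(observation_spec, action_spec, out_key="previous_action"):
    """Return a new observation spec that also describes the previous action."""
    new_spec = dict(observation_spec)
    # Deep-copy so later changes to the action spec do not leak into the observation spec.
    new_spec[out_key] = copy.deepcopy(action_spec)
    return new_spec


# Hypothetical specs, written as plain dicts for illustration only.
observation_spec = {"observation": {"shape": (4,), "dtype": "float32"}}
action_spec = {"shape": (2,), "dtype": "float32", "low": -1.0, "high": 1.0}
new_spec = add_previous_action_spec(observation_spec, action_spec)
```

The original `observation_spec` is left untouched, and the new entry is an independent copy of the action spec.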
I'll get to work on it :)
note to self: the CatFrames transform does this (more generally, it implements a buffer) with observations instead of actions.
Do we need a time lag for this? Otherwise RenameTransform could take care of it (if we just need the last action)
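If a lag longer than one step turns out to be needed, one possible shape for it is a small fixed-length buffer. The sketch below is only an assumption about how that could work; `LaggedActionBuffer` and its `push`/`reset` methods are hypothetical names, and plain floats stand in for action tensors.

```python
from collections import deque


class LaggedActionBuffer:
    """Illustrative buffer returning the action taken `lag` steps before the next step."""

    def __init__(self, lag=1, default=0.0):
        if lag < 1:
            raise ValueError("lag must be at least 1")
        self.default = default
        # A deque with maxlen drops the oldest action automatically.
        self._buffer = deque(maxlen=lag)

    def reset(self):
        # Forget all stored actions at the start of a new episode.
        self._buffer.clear()

    def push(self, action):
        # Store the newest action, then expose the oldest one once the buffer is full.
        self._buffer.append(action)
        if len(self._buffer) == self._buffer.maxlen:
            return self._buffer[0]
        return self.default


# Usage: with lag=1 the buffer simply echoes the last action;
# with lag=2 the exposed action trails by one extra step.
last_action = LaggedActionBuffer(lag=1)
echoed = [last_action.push(a) for a in [1.0, 2.0, 3.0]]

delayed = LaggedActionBuffer(lag=2)
trailing = [delayed.push(a) for a in [1.0, 2.0, 3.0]]
```

With `lag=1` this reduces to the "just the last action" case raised above, which is why a simple rename or copy may be enough when no longer delay is required.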