rl [Feature Request] Add a Transform that remembers the previous action taken

Open maxweissenbacher opened this issue 1 year ago • 4 comments

Motivation

I propose implementing a simple Transform that remembers the previous action taken. This can be very useful for environments with time delay, or when training agents with recurrent policies (for instance, the policy could be a function not only of the state but also of the action taken at the previous time step).

Solution

I'd be happy to implement this if you think this could be useful.

The Transform could simply add a new key, "previous_action" (or another name chosen by the user), to the "next" tensordict. The entry would be a copy of "action" from the input tensordict, initialised to zero (or to another default value specified by the user) when no previous action exists yet.

Checklist

[x] I have checked that there is no similar issue in the repo (required)

Nov 06 '23 17:11 maxweissenbacher

Sure, that wouldn't be too difficult. The _step function of your transform will take the action from the input tensordict and copy it into next_tensordict, and that is all! You will just need to copy the action spec into the observation_spec under the appropriate key.
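The behaviour described above can be sketched without any library dependency. The snippet below is only an illustration of the intended semantics using plain Python dicts in place of tensordicts. The class name `PreviousActionSketch`, its `reset`/`step` methods, and the `action_dim` argument are hypothetical and are not torchrl API.

```python
from copy import deepcopy


class PreviousActionSketch:
    """Plain-dict stand-in for the proposed transform (hypothetical, not torchrl API)."""

    def __init__(self, in_key="action", out_key="previous_action", default=0.0):
        self.in_key = in_key      # key to read the action from
        self.out_key = out_key    # key to write the remembered action to
        self.default = default    # value used before any action has been taken

    def reset(self, reset_td, action_dim):
        # No action exists yet at reset time, so fill the entry with the default.
        reset_td[self.out_key] = [self.default] * action_dim
        return reset_td

    def step(self, td, next_td):
        # Copy the action that was just applied into the "next" data.
        next_td[self.out_key] = deepcopy(td[self.in_key])
        return next_td


transform = PreviousActionSketch()

# After a reset, the remembered action holds the default value.
obs = transform.reset({"observation": [0.0, 0.0]}, action_dim=2)

# After a step, the "next" data carries a copy of the action just taken.
td = {"observation": obs["observation"], "action": [0.5, -1.0]}
next_td = transform.step(td, {"observation": [0.1, 0.2]})
```

In an actual torchrl Transform, the same copy would happen in `_step`, and the observation spec would additionally need an entry for the new key, as noted above.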

Nov 06 '23 18:11 vmoens

I'll get to work on it :)

Nov 07 '23 10:11 maxweissenbacher

note to self: the CatFrames transform does this (more generally, it implements a buffer) with observations instead of actions.

Nov 07 '23 18:11 maxweissenbacher

Do we need a time lag for this? Otherwise, RenameTransform could take care of it (if we just need the last action).

Dec 07 '23 16:12 vmoens
