
[Feature Request] Missing ActionScaling and FlattenAction

Open duburcqa opened this issue 1 year ago • 4 comments

Motivation

In general, the action space may be a dict instead of a plain vector, and it is typically not normalized. Even for a classic gym environment, action wrappers that flatten and normalize actions according to the min/max spec are not provided by the standard API, but are implemented by many libraries.

Solution

First, add a normalization transform ActionScaling to torchrl that relies on the spec to compute loc and scale, unlike ObservationNorm (which uses observation statistics) or VecNorm (which uses a moving average). I would advise raising an exception for completely or partially unbounded action spaces. Second, add an action flattening transform FlattenAction doing very much the same as FlattenObservation.
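To make the proposal concrete, here is a minimal sketch of the underlying math: loc and scale derived from the spec's low/high bounds (raising on unbounded specs, as suggested above), plus a trivial dict-action flattener. Function names (`make_action_scaler`, `flatten_action`) are hypothetical and not part of any torchrl API.

```python
import numpy as np

def make_action_scaler(low, high):
    """Return a function mapping normalized actions in [-1, 1] to [low, high].

    Hypothetical sketch of what ActionScaling could do: loc and scale are
    computed from the spec bounds, not from observed statistics.
    """
    low = np.asarray(low, dtype=np.float64)
    high = np.asarray(high, dtype=np.float64)
    # Reject completely or partially unbounded action spaces, as proposed.
    if not (np.all(np.isfinite(low)) and np.all(np.isfinite(high))):
        raise ValueError("Action spec must be fully bounded to compute loc/scale.")
    loc = (high + low) / 2.0
    scale = (high - low) / 2.0
    return lambda a: loc + scale * np.asarray(a, dtype=np.float64)

def flatten_action(action_dict, keys):
    """Concatenate dict entries (in a fixed key order) into one flat vector.

    Hypothetical sketch of what FlattenAction could do for dict action spaces.
    """
    return np.concatenate([np.ravel(action_dict[k]) for k in keys])
```

For example, with bounds `low=[0, -1]` and `high=[2, 1]`, a normalized action of `[-1, 1]` maps to `[0, 1]`. A real torchrl transform would instead read the bounds from `env.action_spec` and operate on tensordicts, but the loc/scale computation would be the same.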

Alternatives

Keep relying on flattening and normalization wrappers at env-level provided by any other library.

Checklist

  • [x] I have checked that there is no similar issue in the repo (required)

duburcqa avatar May 30 '23 18:05 duburcqa

Got it, sounds about right! Now that you mention it, we could also have a method in ObservationNorm to compute loc and scale based on the spec.

vmoens avatar May 31 '23 17:05 vmoens

Yes, but it is much less common for the observation space to be fully bounded. In any case, the observation distribution under the policy rarely spans the whole space in practice, so it would probably be used much less frequently.

duburcqa avatar May 31 '23 18:05 duburcqa

Hello,

I would like to work on this issue. Is anyone already working on it?

Just to make sure I interpreted the issue correctly, the goal is to implement those two wrappers for the action space, is that right?

PSXBRosa avatar Jul 11 '23 09:07 PSXBRosa

Yes it is! I don't think anyone is working on it for now.

duburcqa avatar Jul 18 '23 06:07 duburcqa