[Feature Request] Missing ActionScaling and FlattenAction
Motivation
In general, the action space may be a dict instead of a plain vector, and is not necessarily normalized. Even for a classic gym environment, action wrappers that flatten the action and normalize it according to the min/max spec are not provided by the standard API, but are implemented by many libraries.
Solution
First, add a normalization transform ActionScaling to torchrl that relies on the spec to compute loc and scale, unlike ObservationNorm or the moving average used by VecNorm. I would advise raising an exception for completely or partially unbounded action spaces. Second, add an action flattening transform FlattenAction doing very much the same as FlattenObservation.
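To illustrate the intended behavior, here is a minimal sketch written as gymnasium-style ActionWrappers rather than torchrl Transforms; the class names ActionScaling and FlattenAction follow the proposal and are not existing torchrl API, and a torchrl implementation would apply the same mapping in the transform's inverse (action) pass.

```python
# Minimal sketch, assuming gymnasium-style ActionWrappers for illustration only.
import numpy as np
import gymnasium as gym


class ActionScaling(gym.ActionWrapper):
    """Expose a [-1, 1] action space and rescale actions to the env's bounded Box."""

    def __init__(self, env):
        super().__init__(env)
        space = env.action_space
        bounded = (
            isinstance(space, gym.spaces.Box)
            and np.isfinite(space.low).all()
            and np.isfinite(space.high).all()
        )
        if not bounded:
            # As proposed: refuse completely or partially unbounded action spaces.
            raise ValueError("ActionScaling requires a fully bounded Box action space")
        # loc/scale are derived from the spec, not from collected data.
        self.loc = (space.high + space.low) / 2.0
        self.scale = (space.high - space.low) / 2.0
        self.action_space = gym.spaces.Box(
            low=-1.0, high=1.0, shape=space.shape, dtype=space.dtype
        )

    def action(self, act):
        # Map the normalized action back to the original bounds.
        return self.loc + self.scale * np.clip(act, -1.0, 1.0)


class FlattenAction(gym.ActionWrapper):
    """Expose a flat action space, mirroring FlattenObservation."""

    def __init__(self, env):
        super().__init__(env)
        self._orig_space = env.action_space
        self.action_space = gym.spaces.flatten_space(self._orig_space)

    def action(self, act):
        # Unflatten back to the original (possibly nested/dict) action space.
        return gym.spaces.unflatten(self._orig_space, act)
```

With such wrappers, one would typically flatten first and then rescale, e.g. `env = ActionScaling(FlattenAction(env))`, so that the scaling transform sees a single bounded Box space.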
Alternatives
Keep relying on env-level flattening and normalization wrappers provided by other libraries.
Checklist
- [x] I have checked that there is no similar issue in the repo (required)
Got it, sounds about right! Now that you mention it, we could also have a method in ObservationNorm to compute loc and scale based on the spec.
Yes, but it is much less common for an observation space to be fully bounded. In any case, the observation distribution under the policy rarely spans the whole space in practice, so it would probably be used much less frequently.
Hello,
I would like to work on this issue, is anyone already working on it?
Just to make sure I interpreted the issue correctly, the goal is to implement those two wrappers for the action space, is that right?
Yes it is! I don't think anyone is working on it for now.