
[Feature Request] partial steps in batched envs

Open vmoens opened this issue 1 year ago • 1 comments

Motivation

In many scenarios we need to step only a subset of the envs in a batch. Examples include collecting a complete trajectory from each of many envs when their episodes end asynchronously, or partial frame skip.

Solution

Serial and parallel envs could read an index key indicating which envs are to be reset or stepped. We need to decide whether this key will be a bool or a long tensor, what it will be named, and whether it will be private. Users will certainly need to be able to mask the data, so we'll need to provide a mask indicating which data is valid.
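To make the proposal concrete, here is a minimal sketch in plain Python of what a partial step could look like, assuming the key is a bool per env. Every name in it (`CounterEnv`, `SerialBatch`, `step_mask`, `valid`) is hypothetical and chosen for illustration only; none of it is an existing API, and the real design could differ.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class CounterEnv:
    """Toy env (hypothetical): the observation is the running sum of actions."""
    state: int = 0

    def step(self, action: int) -> int:
        self.state += action
        return self.state


@dataclass
class SerialBatch:
    """Hypothetical serial batch of envs that supports partial steps."""
    envs: List[CounterEnv] = field(default_factory=list)

    def step(self, actions: List[int], step_mask: List[bool]) -> Tuple[List[int], List[bool]]:
        if not (len(actions) == len(step_mask) == len(self.envs)):
            raise ValueError("actions and step_mask need one entry per env")
        obs = []
        for env, action, do_step in zip(self.envs, actions, step_mask):
            # Masked-out envs are left untouched; their last state is
            # repeated as padding so the output keeps the batch shape.
            obs.append(env.step(action) if do_step else env.state)
        # The returned mask tells the caller which entries are fresh data.
        return obs, list(step_mask)


batch = SerialBatch([CounterEnv() for _ in range(3)])
obs, valid = batch.step([1, 2, 3], [True, False, True])
obs2, valid2 = batch.step([10, 10, 10], [False, True, False])

# Keep only the valid entries of the first step.
fresh = [o for o, v in zip(obs, valid) if v]
```

The output keeps the full batch shape and the mask marks which entries are valid, so downstream code can filter padded entries instead of handling ragged batches. A long index tensor would carry the same information as the positions where the mask is true.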

Alternatives

Eventually we could also index batched envs directly, but for now that is a long way off.

Cc @albertbou92

vmoens • Feb 03 '24 09:02

Where would the index key be set? In a Transform, or would the environment itself do it?

I think the key could be a bool, maybe named padded step or skipped step?

albertbou92 • Feb 08 '24 08:02