[Bug/Question] Target workflow for LSTM Modules?

Open dennismalmgren opened this issue 1 year ago • 0 comments

Describe the question

In the test suites, LSTMModules are created with explicit inputs and outputs before the environment is created, so the primer is added to the environment before parallelization. In the "sample code", the order is reversed: environments are created first, and their specs are used to design and shape the inputs and outputs of the modules. With this environment-first workflow, the LSTMModule primer ends up being added to the finished ParallelEnv. The create primer method ignores any batch dims, however, so a runtime error is raised when transform_observation_spec runs, because the Primer logic assumes that the specs it is given are correct.

To Reproduce

Modify one of the SOTA implementations so that it creates an LSTM module, then add the module's primer to the environment at the end, after the environment has been built.

The question is which of three things should change: create_primer could be updated (maybe to optionally take a batch spec), the Primer logic could stop assuming that the specs are correct, or the intended workflow could be the one used in the tests. The current behavior is a bit of a surprise, though.
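To make the first option concrete, here is a minimal sketch in plain Python that only models the shape bookkeeping. It does not call TorchRL: `make_primer_shapes`, `check_against_env`, the `batch_size` argument, and the key names are all hypothetical stand-ins chosen for illustration, and the real primer works on specs rather than bare tuples.

```python
from typing import Dict, Tuple

Shape = Tuple[int, ...]


def make_primer_shapes(
    num_layers: int, hidden_size: int, batch_size: Shape = ()
) -> Dict[str, Shape]:
    """Hypothetical primer factory returning the shapes of the LSTM states.

    `batch_size` is the proposed optional argument. The default `()`
    models a primer that ignores batch dims.
    """
    state_shape = (num_layers, hidden_size)
    return {
        # Illustrative key names, not taken from the issue.
        "recurrent_state_h": (*batch_size, *state_shape),
        "recurrent_state_c": (*batch_size, *state_shape),
    }


def check_against_env(primer_shapes: Dict[str, Shape], env_batch_size: Shape) -> None:
    """Raise if a primed entry does not start with the env batch dims.

    This is a simplified check on the leading dims only.
    """
    n = len(env_batch_size)
    for key, shape in primer_shapes.items():
        if shape[:n] != env_batch_size:
            raise ValueError(
                f"{key}: shape {shape} does not start with batch dims {env_batch_size}"
            )


# Environment-first workflow with a batched env of 4 workers (assumed size).
env_batch_size = (4,)

# A primer built with the batch dims passes the check.
batched = make_primer_shapes(num_layers=1, hidden_size=64, batch_size=env_batch_size)
check_against_env(batched, env_batch_size)

# A primer built without batch dims is rejected.
unbatched = make_primer_shapes(num_layers=1, hidden_size=64)
try:
    check_against_env(unbatched, env_batch_size)
except ValueError as err:
    print(err)
```

The second option would correspond to running a check like `check_against_env` (or expanding the shapes) inside the Primer itself, instead of trusting the caller.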

cc @albertbou92

dennismalmgren • Apr 23 '24 18:04