agents: Should time_step_spec of an array-valued reward return an array-valued discount?

Open • guachoperez opened this issue 4 years ago • 1 comment

The time_step_spec function accepts only observation_spec and reward_spec. If reward_spec describes a multidimensional array, shouldn't the discount spec match its shape, or shouldn't the function at least accept an argument that controls whether it does?
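
To make the request concrete, here is a minimal sketch of the behavior being asked for. It does not use the TF-Agents API: ArraySpec, TimeStepSpec, time_step_spec_sketch, and the match_discount_to_reward flag are all hypothetical stand-ins, and the scalar default for the discount is an assumption based on the behavior the question implies.

```python
from dataclasses import dataclass
from typing import NamedTuple, Optional, Tuple


@dataclass(frozen=True)
class ArraySpec:
    """Hypothetical stand-in for an array spec (not the TF-Agents class)."""
    shape: Tuple[int, ...]
    dtype: str
    name: str = ""


class TimeStepSpec(NamedTuple):
    """Hypothetical container mirroring the four fields of a time step."""
    step_type: ArraySpec
    reward: ArraySpec
    discount: ArraySpec
    observation: ArraySpec


def time_step_spec_sketch(
    observation_spec: ArraySpec,
    reward_spec: Optional[ArraySpec] = None,
    match_discount_to_reward: bool = False,  # hypothetical flag from the question
) -> TimeStepSpec:
    # Fall back to a scalar reward when no reward spec is given.
    if reward_spec is None:
        reward_spec = ArraySpec((), "float32", "reward")
    # Assumed default: a scalar discount. With the flag set, the discount
    # takes the shape of the reward instead.
    discount_shape = reward_spec.shape if match_discount_to_reward else ()
    return TimeStepSpec(
        step_type=ArraySpec((), "int32", "step_type"),
        reward=reward_spec,
        discount=ArraySpec(discount_shape, "float32", "discount"),
        observation=observation_spec,
    )


obs = ArraySpec((4,), "float32", "observation")
reward = ArraySpec((3,), "float32", "reward")

default_spec = time_step_spec_sketch(obs, reward)
matched_spec = time_step_spec_sketch(obs, reward, match_discount_to_reward=True)

print(default_spec.discount.shape)  # ()
print(matched_spec.discount.shape)  # (3,)
```

With the flag left at its default, the discount stays scalar even though the reward has shape (3,), which is the mismatch described above.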

Sep 18 '20 06:09 guachoperez

I am having the same problem.

Any batched py_environment.PyEnvironment seems to require array-valued discount and step_type specs. @guachoperez, did you find a way around this?

Nov 26 '20 21:11 coleridge72
