
Should time_step_spec of an array valued reward return an array valued discount?

Open • guachoperez opened this issue 4 years ago • 1 comment

The time_step_spec function takes only observation_spec and reward_spec as array specifications. If reward_spec specifies a multidimensional array, shouldn't discount_spec match its shape, or shouldn't the function at least accept an argument saying whether it should?
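To make the question concrete, here is a minimal sketch of the behaviour being asked about. It does not use TF-Agents: ArraySpec and TimeStepSpec below are hypothetical stand-ins written with the standard library only, and the match_discount_to_reward argument is an assumed name for the proposed option, not an existing parameter of the library's time_step_spec.

```python
from dataclasses import dataclass
from typing import NamedTuple, Tuple


@dataclass(frozen=True)
class ArraySpec:
    """Hypothetical stand-in for an array spec (not the TF-Agents class)."""
    shape: Tuple[int, ...]
    dtype: str
    name: str


class TimeStepSpec(NamedTuple):
    """Hypothetical container mirroring the four fields of a time step."""
    step_type: ArraySpec
    reward: ArraySpec
    discount: ArraySpec
    observation: ArraySpec


def time_step_spec(observation_spec, reward_spec, match_discount_to_reward=False):
    """Build a time step spec.

    match_discount_to_reward is an assumed flag illustrating the proposal:
    when True, the discount spec takes the reward's shape instead of
    staying scalar.
    """
    discount_shape = reward_spec.shape if match_discount_to_reward else ()
    return TimeStepSpec(
        step_type=ArraySpec((), "int32", "step_type"),
        reward=reward_spec,
        discount=ArraySpec(discount_shape, "float32", "discount"),
        observation=observation_spec,
    )


obs = ArraySpec((4,), "float32", "observation")
rew = ArraySpec((3,), "float32", "reward")  # array-valued reward

scalar_spec = time_step_spec(obs, rew)
matched_spec = time_step_spec(obs, rew, match_discount_to_reward=True)

print(scalar_spec.discount.shape)   # scalar discount despite a vector reward
print(matched_spec.discount.shape)  # discount follows the reward's shape
```

With the flag left at its default, the discount stays scalar while the reward is a vector, which is the mismatch described above. Whether an opt-in flag or always matching the reward shape is the better design is exactly the open question.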

guachoperez, Sep 18 '20 06:09

I am also having the same problem.

Any batched py_environment.PyEnvironment seems to require arrays for discount_spec and step_type. @guachoperez Did you find any way around this?

coleridge72, Nov 26 '20 21:11