ngroves08

Results: 2 issues by ngroves08

I have a project where I would like to know whether the last action applied to the environment came from the agent's policy or from a random action (as a...

Hi there, I think I've found a minor bug in the Random_TF_Environment class. I've discovered that it fails when the time_step_spec includes a reward_spec that is a multi-dimensional array (as...

contributions welcome