R Devon Hjelm
Digging a little, it appears that sometimes the next_timestep variable has None values: https://github.com/deepmind/acme/blob/2871e3216d2ffc2bc0ffea8b6a0e3071897608b9/acme/agents/jax/actors.py#L95 `TimeStep(step_type=..., reward=None, discount=None, observation=<some nonzero array>)`
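For context, a None reward/discount is actually expected on the very first timestep of an episode: `dm_env.restart` builds FIRST timesteps that way. A quick demo with plain dm_env (nothing Acme-specific) to show what I mean:

```
import dm_env
import numpy as np

# dm_env.restart produces the FIRST timestep of an episode; per the dm_env
# API, its reward and discount are deliberately None.
first = dm_env.restart(observation=np.zeros(4))
print(first.reward, first.discount)  # -> None None

# Subsequent steps built with dm_env.transition carry real values.
mid = dm_env.transition(reward=1.0, observation=np.zeros(4))
print(mid.reward, mid.discount)  # -> 1.0 1.0
```

So the bug seems to be a restart-style timestep showing up where a MID one is expected.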
I think I've tracked down where things are going wrong, at least in the environment loop: https://github.com/deepmind/acme/blob/2871e3216d2ffc2bc0ffea8b6a0e3071897608b9/acme/environment_loop.py#L106 With the `-run_distributed` option, the environment `step` call sometimes returns a timestep with...
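One way to confirm this failure mode is a defensive check around `environment.step` (this is something to add locally for debugging, not upstream code):

```
import dm_env

def check_timestep(timestep: dm_env.TimeStep) -> dm_env.TimeStep:
  """Debugging guard for timesteps coming out of environment.step.

  A mid-episode timestep should never carry a None reward/discount; hitting
  this means the environment was reset underneath us, e.g. by another actor
  sharing the same instance.
  """
  if timestep.reward is None and not timestep.first():
    raise RuntimeError(f'Unexpected restart-style timestep: {timestep}')
  return timestep
```

Wrapping the `environment.step` call in `run_episode` with this would make the failure loud at the source, rather than a downstream crash on None arithmetic.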
I was playing around with the number of agents here: https://github.com/deepmind/acme/blob/2871e3216d2ffc2bc0ffea8b6a0e3071897608b9/examples/baselines/rl_discrete/run_dqn.py#L80 Reducing the number of agents seems to make the error appear later in training, but it still appears. With...
I think I know the issue: the environment factory in that example (in fact all examples) returns the same instance of the same environment, so there's some sort of async...
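If that diagnosis is right, a user-side fix is to make the factory construct a fresh environment on every call instead of closing over one shared instance. A sketch (the constructor here is a hypothetical stand-in for helpers like make_atari_environment in the examples):

```
import dm_env

def make_environment(seed: int = 0) -> dm_env.Environment:
  """Hypothetical stand-in for the example's environment constructor."""
  ...  # construct and return a fresh dm_env.Environment here

# Buggy pattern: a single instance captured by the closure is handed to every
# actor, so concurrent resets and steps interleave across processes.
shared_env = make_environment()
bad_factory = lambda seed: shared_env

# Safer pattern: build a new instance per call, so each actor steps its own
# private environment.
def good_factory(seed: int) -> dm_env.Environment:
  return make_environment(seed)
```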
Any idea how to even load these tf saved parameters back into a Jax agent (e.g., from the dqn example)? I'm running into this issue, and so far it isn't pretty...
Thanks so much! I was very close: after much digging through the API I had discovered that adaptor, and was able to wrap the learner with it, but hadn't completely gotten...
Thanks a bunch, there are some useful magic functions in your example. Here is a working example for the dqn demo:

```
import tensorflow as tf
import acme
from acme.testing...
```
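For anyone landing here later, the general shape of the conversion is to read the TF checkpoint into numpy and hand the arrays to the JAX side. This is only a sketch: the checkpoint path is a placeholder, and mapping the variable names onto the agent's haiku/flax parameter tree is the agent-specific part:

```
import jax.numpy as jnp
import tensorflow as tf

# Read raw tensors out of a TF checkpoint without rebuilding any TF graph.
reader = tf.train.load_checkpoint('/path/to/checkpoint')  # placeholder path

# Flat dict of JAX arrays keyed by checkpoint variable name; reshaping these
# into the agent's parameter tree is the fiddly part.
params = {
    name: jnp.asarray(reader.get_tensor(name))
    for name in reader.get_variable_to_shape_map()
}
```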
Sure, I can flush the logger before terminating. Digging a little, I don't think the logger is being flushed at all in the training loop. There is exactly one mention...
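A minimal workaround in the meantime, assuming the logger follows Acme's `Logger` interface (which I believe exposes `close()`), would be to close it explicitly on the way out:

```
from acme.utils import loggers

def run_training(logger):
  ...  # hypothetical stand-in for the training loop

logger = loggers.make_default_logger('learner')
try:
  run_training(logger)
finally:
  # For the file-backed loggers (e.g. CSVLogger), close() flushes any
  # buffered rows before the process exits.
  logger.close()
```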
Those sound like great suggestions. I'll look into it.
So I've done the following, trying to follow the recent changes to the experiment config. I've replaced the config in run_dqn.py with the following code:

```
logger = lambda label,...
```
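For reference, this is the shape I'd expect a logger factory to take against the experiments API at that commit; the signature is from memory, so treat it as an assumption and check it against acme/jax/experiments/config.py:

```
from acme.utils import loggers

# Assumed factory signature: (label, steps_key, task_instance) -> Logger.
def logger_factory(label: str, steps_key: str = 'steps', task_instance: int = 0):
  return loggers.make_default_logger(label, steps_key=steps_key)
```

which would then be wired into the experiment config as `logger_factory=logger_factory`.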