Guy Davidson
Another question which I'll tack onto here -- the default value for the `target-update` parameter is 8000, which matches Table 1 in the Rainbow paper, where it is reported as 32k frames....
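For reference, a quick sketch of the arithmetic behind that match. It assumes the standard Atari action repeat of 4 emulator frames per agent step; the constant and function names below are illustrative and not taken from the repo.

```python
# Assumption: each agent step repeats the chosen action for 4 emulator frames.
FRAMES_PER_STEP = 4
TARGET_UPDATE_STEPS = 8000  # default value of target-update quoted above


def steps_to_frames(steps, frames_per_step=FRAMES_PER_STEP):
    """Convert a count of agent steps into a count of emulator frames."""
    return steps * frames_per_step


target_update_frames = steps_to_frames(TARGET_UPDATE_STEPS)
print(target_update_frames)  # 32000, i.e. the 32k frames in Table 1
```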
I think that makes sense. I don't know why you'd evaluate over a fixed number of frames rather than episodes. You could make a TODO to eventually implement their evaluation...
Thanks for the clarification. There appears to be so much voodoo around the implementation details that it is quite hard to know when you can trust your results. It's interesting...
@Kaixhin -- it seems that most papers also evaluate on either no-op starts or human starts. Did you ever take a stab at implementing either?
Ah, I see, in env.reset(). That makes sense.
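To make the idea concrete, here is a minimal sketch of no-op starts folded into a reset helper. It is not the repo's actual code: `DummyEnv`, `reset_with_noops`, the choice of action 0 as the no-op, and the cap of 30 no-ops are all assumptions made for illustration.

```python
import random

NOOP_ACTION = 0  # assumption: action index 0 is the no-op
MAX_NOOPS = 30   # assumption: cap on the number of no-ops per reset


class DummyEnv:
    """Hypothetical stand-in environment so the sketch runs on its own."""

    def __init__(self):
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t

    def step(self, action):
        self.t += 1
        done = self.t >= 1000
        return self.t, 0.0, done


def reset_with_noops(env, rng, max_noops=MAX_NOOPS):
    """Reset the env, then take a random number of no-op actions."""
    state = env.reset()
    num_noops = rng.randint(0, max_noops)  # inclusive on both ends
    for _ in range(num_noops):
        state, _, done = env.step(NOOP_ACTION)
        if done:  # unlikely, but restart if the episode ends during no-ops
            state = env.reset()
    return state, num_noops


state, num_noops = reset_with_noops(DummyEnv(), random.Random(0))
print(f"started after {num_noops} no-ops")
```

Passing in the random generator keeps evaluation runs reproducible, which matters when comparing scores across checkpoints.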
I've implemented something to this effect just by pickling the memory and loading a checkpoint. My code is a little coupled to where and how I store these saved files,...
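A rough sketch of what pickling a replay memory can look like, decoupled from any particular storage layout. The `ReplayMemory` class, `save_memory`, and `load_memory` are hypothetical stand-ins, not the code from my branch or the repo.

```python
import os
import pickle
import tempfile
from collections import deque


class ReplayMemory:
    """Hypothetical minimal replay buffer; the real memory class will differ."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.buffer = deque(maxlen=capacity)

    def append(self, transition):
        self.buffer.append(transition)

    def __len__(self):
        return len(self.buffer)


def save_memory(memory, path):
    """Pickle the memory's contents as plain data, via a temp file."""
    state = {"capacity": memory.capacity, "transitions": list(memory.buffer)}
    tmp_path = path + ".tmp"
    with open(tmp_path, "wb") as f:
        pickle.dump(state, f, protocol=pickle.HIGHEST_PROTOCOL)
    os.replace(tmp_path, path)  # avoid leaving a half-written checkpoint


def load_memory(path):
    """Rebuild a ReplayMemory from a file written by save_memory."""
    with open(path, "rb") as f:
        state = pickle.load(f)
    memory = ReplayMemory(state["capacity"])
    for transition in state["transitions"]:
        memory.append(transition)
    return memory


with tempfile.TemporaryDirectory() as tmp_dir:
    memory = ReplayMemory(capacity=3)
    for i in range(5):
        memory.append((i, 0, 0.0))
    checkpoint_path = os.path.join(tmp_dir, "memory.pkl")
    save_memory(memory, checkpoint_path)
    restored = load_memory(checkpoint_path)
    print(len(restored))
```

Storing plain lists and ints rather than the object itself keeps the file loadable even if the memory class is later renamed or moved.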
See https://github.com/Kaixhin/Rainbow/pull/58 for the implementation details. I guess I now made checkpointing true by default and at the same interval as the evaluation interval, but it doesn't have to be...
Updating minerl and gym by itself didn't do it, but also forcing a reinstall of adoptopenjdk8 through brew got it past this step. However, it then failed on...
@Filco306 I honestly don't remember. One thing you could try, if you haven't already: working in a brand new conda environment rather than an existing one. This...
Ah cool, thank you! On Tue, Aug 18, 2020 at 12:25 PM Filip Cornell wrote: > @guydav We managed to solve it. It was the java version - in...