Aleksei Petrenko
I think this is better asked in the swarm-rl repo. Also, would be great if you could post the whole error trace. For now, I suggest changing `--replay_buffer_sample_prob=0.75` to `--replay_buffer_sample_prob=0`...
@GoingMyWay thank you for reporting! @BoyuanLong can you please take a look?
@erikwijmans is right. The loop is trying to iterate over `self.env_runners`, which is None. It is, of course, not supposed to be None. Very likely something happened earlier in the log,...
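To illustrate the failure mode described in the comment above, here is a minimal sketch in isolation. The `Worker` class, its `advance` method, and the `try_advance` helper are hypothetical stand-ins and not SampleFactory's actual code; only the attribute name `env_runners` comes from the comment.

```python
class Worker:
    """Hypothetical stand-in; only the attribute name `env_runners` is from the comment."""

    def __init__(self):
        # Assumption: an earlier initialization step failed, so the
        # attribute was never replaced with a real list of runners.
        self.env_runners = None

    def advance(self):
        # Iterating over None raises TypeError before the loop body runs.
        for runner in self.env_runners:
            runner.step()


def try_advance(worker):
    """Return the TypeError message if iteration fails, else None."""
    try:
        worker.advance()
    except TypeError as exc:
        return str(exc)
    return None


message = try_advance(Worker())
print(message)
```

The error surfaces at the loop, but the real cause is whatever left the attribute unset, which is why the earlier part of the log is usually the place to look.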
This is not ready to merge yet, right?
Hello! SampleFactory is not a simulator, but a reinforcement learning algorithm. If you're looking for fast simulators, check out our Megaverse: https://www.megaverse.info/ It supports multi-agent training at 10^5-10^6 samples per...
Megaverse is the RL environment. I believe it should be similar to using any other RL environment with this algorithm. There is no documentation for this specific algorithm.
Hi! Can you explain what exactly you mean by a deterministic episode? I.e. you have a trained policy, and you want to evaluate it in a way that is consistent...
Hi @nathanlct I'd say the easiest way is to modify the actor worker class to make one of them "special" in some way. I.e. you can make actor_worker #0 into...
Hi @nathanlct! Sorry for the delay.

> I am trying to put something together but am still struggling to understand how the code works.
>
> So from what...
@edbeeching maybe you can take a look if you have time! :)