HandyRL

num_parallel affecting learning results

Open • spicytomatoes opened this issue 4 years ago • 1 comment

Hi, I've tried training on a 32-core machine, so naturally I set num_parallel to 32. However, the model does not seem to learn at all. Strangely, when I set num_parallel to 6, the model learns. The rest of the config is exactly the same as the PubHRL config for Hungry Geese.

spicytomatoes Dec 14 '21 17:12

Thanks for your report! We ran several experiments with 64 workers, and all of the training runs were successful. However, it is not easy for the model to learn which moves are illegal in this task, and I am sure that training is not stable.

If there is one thing I can say, it is that the PubHRL experiment setup was decided on the first try, so I cannot recommend it with confidence. As I mentioned in the discussion, I think forward_steps=1 is generally better in this kind of task. Also, a larger entropy regularization coefficient would be better.

YuriCat Jan 08 '22 04:01
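
For readers unsure what the entropy regularization coefficient mentioned above does, here is a minimal sketch of how such a coefficient typically enters a policy-gradient loss. It is not taken from HandyRL's code: the function names `entropy` and `regularized_policy_loss`, the parameter `entropy_coef`, and the example probabilities are all hypothetical, and the four-action distributions are only an assumption chosen to resemble a game with four moves.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of a discrete action distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def regularized_policy_loss(probs, action, advantage, entropy_coef):
    """One-step policy-gradient loss minus an entropy bonus.

    Hypothetical helper for illustration only. A larger entropy_coef
    lowers the loss more for spread-out policies than for peaked ones.
    """
    pg_loss = -advantage * math.log(probs[action])
    return pg_loss - entropy_coef * entropy(probs)

# Two illustrative policies over four actions (assumed values).
peaked = [0.97, 0.01, 0.01, 0.01]   # almost deterministic
uniform = [0.25, 0.25, 0.25, 0.25]  # maximally exploratory

for coef in (0.0, 0.01, 0.1):
    # How much more entropy bonus the uniform policy earns than the peaked one.
    bonus_gap = coef * (entropy(uniform) - entropy(peaked))
    print(f"entropy_coef={coef}: bonus gap (uniform - peaked) = {bonus_gap:.4f}")
```

The gap grows linearly with the coefficient, which is one reason a larger value can keep a policy from collapsing onto a few actions too early. Whether that explains the difference between num_parallel values reported in this thread is not something the sketch can show.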