MicroRTS-Py
Faster Convergence
Training an agent still takes a long time: the particular experiment in #36 took 4d 9h 11m 14s to finish.
Looking at the reward chart, it appears the agent reaches about 70% of its final performance within just 50M steps (roughly 10 hours into training).
We should therefore try to optimize for a 10-hour computational budget.
I think the bottleneck is still largely on the NN side, so one thing worth trying is reducing the NN size.
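To get a feel for how much shrinking the network buys, here is a rough back-of-the-envelope parameter count for a small conv stack. The channel widths below are hypothetical, purely for illustration, not the repo's actual architecture:

```python
def conv2d_params(in_ch, out_ch, k):
    """Parameter count of one Conv2d layer: weights (in*out*k*k) plus biases (out)."""
    return in_ch * out_ch * k * k + out_ch

# Hypothetical widths for illustration only.
full = conv2d_params(27, 32, 3) + conv2d_params(32, 64, 3) + conv2d_params(64, 128, 3)
half = conv2d_params(27, 16, 3) + conv2d_params(16, 32, 3) + conv2d_params(32, 64, 3)

print(full, half)  # halving each hidden width cuts conv parameters by ~3-4x
```

Since the forward/backward cost of the conv layers scales roughly with their parameter count, a halved-width network could noticeably cut wall-clock time per step, at some (hopefully small) cost in final performance.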
Alternatively, I noticed that learning rate annealing seems to really help the algorithm converge toward the end, so we could also try a smaller learning rate with annealing turned off.
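The idea, sketched below with hypothetical numbers (the `2.5e-4` starting rate and the fixed `1e-4` alternative are illustrative, not the experiment's actual settings): instead of linearly annealing the learning rate to zero over the full run, start at a smaller fixed rate and skip annealing entirely.

```python
def annealed_lr(initial_lr, step, total_steps):
    """Linear annealing to zero, as commonly done in PPO implementations."""
    return initial_lr * (1.0 - step / total_steps)

# Annealed schedule: starts high, ends at zero.
start = annealed_lr(2.5e-4, 0, 100)
end = annealed_lr(2.5e-4, 100, 100)

# Hypothetical alternative: a fixed smaller rate, no schedule at all.
fixed_lr = 1e-4

print(start, end, fixed_lr)
```

If the late-training benefit really comes from the rate simply being small by then, a constant small rate might recover most of the convergence benefit within the 10-hour budget.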
Maybe we could also tune the discount factor (and we should visualize the discounted returns, which are what the agent actually optimizes for).
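For the visualization, the quantity to plot per episode is the discounted return G_t = r_t + γ·G_{t+1}. A minimal sketch of the backward computation (the reward list and γ below are toy values for illustration):

```python
def discounted_returns(rewards, gamma):
    """Compute G_t = r_t + gamma * G_{t+1} backwards over one episode."""
    returns = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns

print(discounted_returns([1.0, 1.0, 1.0], 0.5))  # [1.75, 1.5, 1.0]
```

Plotting these alongside the raw reward chart would show how sensitive the optimization target actually is to the choice of γ.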
#56 tries to address this issue.