MicroRTS-Py

Faster Convergence

Open vwxyzjn opened this issue 3 years ago • 2 comments

Training an agent still takes a long time: the experiment in #36 took 4d 9h 11m 14s to finish.

Looking at the reward chart, the agent appears to reach 70% of its final performance in just 50M steps (about 10 hours into training).

image

We should try to optimize for a 10-hour computational budget.
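To put those numbers side by side, here is a quick back-of-the-envelope calculation using only the figures quoted above. The variable names are just for illustration, and the throughput is an implied average over the first 10 hours, not a measured value.

```python
# Wall-clock figures quoted in this issue.
full_run_s = ((4 * 24 + 9) * 60 + 11) * 60 + 14  # 4d 9h 11m 14s in seconds
budget_s = 10 * 60 * 60                          # proposed 10-hour budget in seconds
steps_in_budget = 50_000_000                     # ~50M steps reached in ~10 hours

# Share of the full run's wall-clock time that the budget represents.
budget_fraction = budget_s / full_run_s

# Average environment steps per second implied by "50M steps in about 10 hours".
steps_per_second = steps_in_budget / budget_s

print(f"full run: {full_run_s} s")
print(f"budget is {budget_fraction:.1%} of the full run")
print(f"implied throughput: {steps_per_second:.0f} steps/s")
```

So the 10-hour budget is a bit under a tenth of the wall-clock time of the full run, for roughly 70% of the final performance.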

vwxyzjn • Jan 27 '22 04:01

I think the bottleneck is still largely on the NN side, so one thing worth trying is to reduce the NN size.

Alternatively, I noticed that learning rate annealing seems to really help the algorithm converge towards the end of training. So maybe we could also try a smaller learning rate with annealing turned off.
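A minimal sketch of the two options being compared, assuming the annealing is a linear decay of the learning rate over the number of updates (the issue does not say which schedule is used). The function names and the values `2.5e-4` and `1e-4` are hypothetical, chosen only for illustration.

```python
def annealed_lr(update, num_updates, base_lr):
    """Linear annealing (assumed schedule): base_lr at update 1, decaying towards 0."""
    frac = 1.0 - (update - 1.0) / num_updates
    return frac * base_lr


def constant_lr(update, num_updates, base_lr):
    """No annealing: the same learning rate for every update."""
    return base_lr


# Hypothetical settings, not taken from the experiment in this issue.
num_updates = 10
base_lr = 2.5e-4   # starting learning rate with annealing
small_lr = 1e-4    # smaller fixed learning rate without annealing

annealed = [annealed_lr(u, num_updates, base_lr) for u in range(1, num_updates + 1)]
constant = [constant_lr(u, num_updates, small_lr) for u in range(1, num_updates + 1)]

for u, (a, c) in enumerate(zip(annealed, constant), start=1):
    print(f"update {u:2d}: annealed={a:.2e} constant={c:.2e}")
```

With these made-up numbers the annealed schedule starts above the constant one and ends below it, which is the trade-off in question: the small constant rate takes the smaller late-training step sizes and applies something similar from the start.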

Maybe we could also tune the discount factor. We should also visualize the discounted returns (what the agent actually optimizes for).
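For reference, a small sketch of how the discounted return could be computed from a list of per-step rewards so it can be logged next to the raw episodic return. `discounted_returns` is a hypothetical helper, not a function from this repo, and the reward sequence is invented.

```python
def discounted_returns(rewards, gamma):
    """Return G_t = r_t + gamma * r_{t+1} + gamma^2 * r_{t+2} + ... for every step t."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step reuses the return of the step after it.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


# Hypothetical sparse-reward episode: a single reward at the last step.
rewards = [0.0, 0.0, 0.0, 1.0]

for gamma in (0.9, 0.99, 1.0):
    start_return = discounted_returns(rewards, gamma)[0]
    print(f"gamma={gamma}: discounted return from the first step = {start_return:.4f}")
```

The lower the discount factor, the less a late reward counts at the start of the episode, so the discounted curve can look quite different from the undiscounted reward chart.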

image

vwxyzjn • Jan 27 '22 04:01

#56 tries to address this issue.

vwxyzjn • Jan 31 '22 23:01