A3C Doom Basic: Skip Count
Hi,
Based on Figure 7 of the ViZDoom paper, I tried to use a skip count to speed up training, as follows:
r = self.env.make_action(self.actions[a], 4) / 100.0
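For reference, here is a minimal, self-contained sketch of that step outside the A3C worker. The config path and the action list are assumptions based on the basic scenario; the `make_action(action, tics)` call is the standard ViZDoom API that repeats the action for the given number of tics:

```python
from vizdoom import DoomGame

SKIP_COUNT = 4  # repeat each chosen action for this many tics

game = DoomGame()
game.load_config("basic.cfg")   # path assumed; adjust to your ViZDoom install
game.init()

# basic.cfg exposes three buttons: move left, move right, attack
actions = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

game.new_episode()
while not game.is_episode_finished():
    a = 0  # placeholder for the policy's sampled action index
    # make_action(action, tics) repeats the action for `tics` tics and
    # returns the reward summed over all of them.
    r = game.make_action(actions[a], SKIP_COUNT) / 100.0
game.close()
```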
However, the agent performed very poorly compared to the original code, which converges after around 200~300 episodes.
The ideal average episode length should be around 30, but it stays around 70.
I think the agent found some sub-optimal policy and got stuck in it.
I then made this small change, and the results are much more logical.
Instead of
r = self.env.make_action(self.actions[a],4) / 100.0
I used the following, to keep the reward values on the same scale as the original case: since the reward is accumulated over the skipped tics, with a skip count of 4 the raw value is roughly four times larger, so I divide by 400 instead of 100.
r = self.env.make_action(self.actions[a],4) / 400.0
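To keep the divisor and the skip count from drifting apart, one option is to derive the scale directly from the skip count. A minimal sketch (the 100.0 baseline is the original code's normalizer; everything else is illustrative):

```python
SKIP_COUNT = 4

def step(env, actions, a, skip=SKIP_COUNT):
    """Repeat the chosen action for `skip` tics and return a normalized reward.

    Dividing by 100.0 * skip keeps the per-step reward on roughly the same
    scale as the original single-tic code (100.0 when skip == 1, 400.0 when skip == 4).
    """
    return env.make_action(actions[a], skip) / (100.0 * skip)
```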
Results:
My Conclusion
- Using skip count = 4, the agent converged after around 400 episodes, in about 5 minutes
- Using no skip count (the original code), the agent converged after around 250 episodes, in about 10 minutes

Because of the skip count, each episode finishes much faster, which is why more episodes were needed with skip count = 4; on the other hand, convergence took much less wall-clock time (roughly a 2x speed-up).
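If anyone wants to sanity-check the wall-clock effect of the skip count on the environment alone (not the full training loop), a rough throughput check could look like this; the config path, action list, and function name are all just illustrative:

```python
import random
import time

from vizdoom import DoomGame

def episodes_per_minute(skip, n_episodes=50, config="basic.cfg"):
    """Rough throughput check: episodes finished per wall-clock minute
    for a given skip count, using random actions."""
    game = DoomGame()
    game.load_config(config)
    game.set_window_visible(False)
    game.init()
    actions = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
    start = time.time()
    for _ in range(n_episodes):
        game.new_episode()
        while not game.is_episode_finished():
            game.make_action(random.choice(actions), skip)
    game.close()
    return n_episodes / ((time.time() - start) / 60.0)
```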
What do you think?