DeepRL-Agents
A3C Basic Doom: effect of gamma (Discuss)
Hi, I would like to share the effect of gamma on performance.
- I believe that gamma = 0.99 means we think that rewards from states far in the future have a large effect on our estimate of the discounted return.
- And gamma = 0.8 means we think that those future rewards have a much smaller effect on our estimate of the discounted return.
Correct? (A small numerical sketch is below.)
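To make this concrete, here is a quick numerical sketch (my own, not from the repo): with a discount factor gamma, the reward k steps ahead is weighted by gamma**k, and a rough "effective horizon" is 1 / (1 - gamma).

```python
# Sketch: how gamma changes the weight of future rewards in the discounted return.
for gamma in (0.99, 0.95, 0.8):
    horizon = 1.0 / (1.0 - gamma)   # rough effective planning horizon in steps
    weight_at_20 = gamma ** 20      # weight given to a reward 20 steps ahead
    print(f"gamma={gamma:.2f}  ~horizon={horizon:5.1f} steps  "
          f"weight of reward 20 steps ahead={weight_at_20:.3f}")

# gamma=0.99  ~horizon=100.0 steps  weight of reward 20 steps ahead=0.818
# gamma=0.95  ~horizon= 20.0 steps  weight of reward 20 steps ahead=0.358
# gamma=0.80  ~horizon=  5.0 steps  weight of reward 20 steps ahead=0.012
```

So with gamma = 0.8 the agent is effectively near-sighted to about 5 steps, which may explain the poor convergence in Case 3 below.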
Case 1: gamma = 0.99
Case 2: gamma = 0.95
Convergence looks smoother than with 0.99 (what do you think?)
Case 3: gamma = 0.8
Almost no convergence
Finally:
gamma = 0.99 # discount rate for advantage estimation and reward discounting
Is it logical to use two different gammas?
Hi @IbrahimSobh, I recently looked into the GAE paper (https://arxiv.org/pdf/1506.02438.pdf), and it indeed uses two different parameters: gamma for reward discounting, and lambda to smoothly interpolate between different n-step estimators of the advantage function. They found empirically that "the best value of lambda is much lower than the best value of gamma".
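For reference, here is a minimal sketch of how the two parameters enter the GAE computation described in that paper (this is my own illustration, not the code from this repo; the rollout values at the end are made up):

```python
import numpy as np

def gae_advantages(rewards, values, bootstrap_value, gamma=0.99, lam=0.95):
    """Generalized Advantage Estimation: gamma discounts rewards,
    lam interpolates between 1-step TD (lam=0) and Monte Carlo (lam=1)."""
    values = np.append(values, bootstrap_value)
    # TD residuals: delta_t = r_t + gamma * V(s_{t+1}) - V(s_t)
    deltas = rewards + gamma * values[1:] - values[:-1]
    advantages = np.zeros_like(rewards, dtype=np.float64)
    running = 0.0
    # Backward recursion: A_t = delta_t + gamma * lam * A_{t+1}
    for t in reversed(range(len(rewards))):
        running = deltas[t] + gamma * lam * running
        advantages[t] = running
    return advantages

# Toy 4-step rollout with made-up rewards and value predictions.
rewards = np.array([0.0, 0.0, 0.0, 1.0])
values = np.array([0.1, 0.2, 0.3, 0.5])
print(gae_advantages(rewards, values, bootstrap_value=0.0))
```

So the answer to the question above is yes: using one parameter (gamma) to define the discounted return and a second one (lambda) for the advantage estimator is exactly what the paper proposes, and tuning them separately is expected.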