DeepRL-Agents

A3C Basic Doom: effect of gamma (Discuss)

Open IbrahimSobh opened this issue 8 years ago • 1 comments

Hi, I would like to share with you the effect of gamma on performance.

  • I believe gamma = 0.99 means we assume that future states have a large effect on our estimate of the discounted future reward.
  • And gamma = 0.8 means we assume that future states have a smaller effect on that estimate.

Correct?
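
To make the intuition above concrete, here is a minimal sketch of how much weight each gamma puts on a reward received k steps in the future. The function names `discount_weights` and `effective_horizon` are hypothetical helpers written for this illustration; they are not part of the repository. The "effective horizon" 1 / (1 - gamma) is only a rule of thumb (it is the sum of the weights gamma**k over all k).

```python
def discount_weights(gamma, horizon):
    """Weight gamma**k applied to the reward k steps ahead, for k = 0..horizon-1."""
    return [gamma ** k for k in range(horizon)]


def effective_horizon(gamma):
    """Rule-of-thumb horizon 1 / (1 - gamma), i.e. the sum of gamma**k over all k."""
    return 1.0 / (1.0 - gamma)


if __name__ == "__main__":
    # The three values tried in this issue.
    for gamma in (0.99, 0.95, 0.8):
        weights = discount_weights(gamma, 21)
        print(
            f"gamma={gamma}: weight on a reward 20 steps ahead = {weights[20]:.4f}, "
            f"effective horizon ~ {effective_horizon(gamma):.0f} steps"
        )
```

With gamma = 0.99 a reward 20 steps away still counts for roughly 0.82 of its value, with gamma = 0.95 for roughly 0.36, and with gamma = 0.8 for only about 0.01. So a lower gamma does make the estimate depend less on distant future states.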

Case 1: gamma = 0.99

doom_basic_all

Case 2: gamma = 0.95

Convergence looks smoother than with 0.99 (what do you think?)

doom_basic_gamma__95

Case 3: gamma = 0.8

Almost no convergence

doom_basic_gamma__08

Finally:

gamma = 0.99 # discount rate for advantage estimation and reward discounting

Is it logical to use two different gammas?

IbrahimSobh, Mar 18 '17 14:03

Hi @IbrahimSobh, I recently looked into the GAE paper (https://arxiv.org/pdf/1506.02438.pdf), and they did indeed use two different parameters: gamma for reward discounting and lambda to smoothly interpolate between different n-step estimators of the advantage function. They found empirically that "the best value of lambda is much lower than the best value of gamma".
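
As a rough illustration of the two-parameter scheme, here is a minimal pure-Python sketch of generalized advantage estimation for a single rollout. The function name `gae_advantages` and its arguments are assumptions made for this example, not code from the repository or the paper. It accumulates the one-step TD residuals delta_t = r_t + gamma * V(s_{t+1}) - V(s_t) backwards in time, discounting them by gamma * lambda.

```python
def gae_advantages(rewards, values, bootstrap_value, gamma=0.99, lam=0.95):
    """Generalized advantage estimates for one rollout.

    rewards[t] and values[t] belong to step t; bootstrap_value is the value
    estimate for the state after the last step (use 0.0 if the episode ended).
    """
    values_plus = list(values) + [bootstrap_value]
    advantages = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step reuses the accumulated sum from later steps.
    for t in reversed(range(len(rewards))):
        delta = rewards[t] + gamma * values_plus[t + 1] - values_plus[t]
        running = delta + gamma * lam * running
        advantages[t] = running
    return advantages


if __name__ == "__main__":
    rewards = [1.0, 1.0, 1.0]
    values = [0.5, 0.5, 0.5]
    # lam=0 keeps only the one-step TD residual; lam=1 sums all of them.
    print(gae_advantages(rewards, values, 0.0, gamma=0.9, lam=0.0))
    print(gae_advantages(rewards, values, 0.0, gamma=0.9, lam=1.0))
```

Setting lam = 1 discounts the residuals by gamma alone, which corresponds to using a single gamma for both reward discounting and advantage estimation. Values of lam below 1 shorten the advantage estimator's horizon without changing the discounting of the returns themselves.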

mkisantal, Nov 19 '17 20:11