muzero-general
Mean_value plot in Total_reward - Interpretation
Hi,
Could I please ask what the Mean_value plot shows and what its significance is in reinforcement learning (specifically for this algorithm)? I tried to work this out from the code but couldn't.
Also, when interpreting the results from a reinforcement learning perspective, should we look at the curves with zero smoothing or with 0.8/0.9 smoothing?
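For context on the smoothing question: as far as I understand, the smoothing slider in the plotting UI (I assume TensorBoard) applies roughly an exponential moving average to the logged scalars. Below is a minimal sketch of that assumption. The `ema_smooth` function and the sample rewards are made up for illustration; they are not taken from the muzero-general code or from the plotting tool's source.

```python
def ema_smooth(values, weight):
    """Exponential moving average over a list of scalars.

    Assumption: this approximates what a smoothing slider does.
    weight = 0.0 returns the raw values; values near 1.0 smooth heavily.
    """
    smoothed = []
    # Seed the running average with the first point (hypothetical choice).
    last = values[0] if values else 0.0
    for v in values:
        last = weight * last + (1.0 - weight) * v
        smoothed.append(last)
    return smoothed


# Made-up, very noisy per-episode rewards.
rewards = [0.0, 10.0, 0.0, 10.0, 0.0, 10.0]

raw = ema_smooth(rewards, 0.0)    # identical to the logged values
heavy = ema_smooth(rewards, 0.9)  # much flatter, but lags behind changes

print(raw)
print(heavy)
```

If that is right, zero smoothing shows the true episode-to-episode variance, while 0.8/0.9 shows the trend at the cost of lag, which is why I am unsure which one to report.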
A response would be highly appreciated.
Best Wishes!