
unit2 - train

Open fardinafdideh opened this issue 1 year ago • 1 comments

  • I added (and commented) the following formula for calculating epsilon. Unlike the current formula, it depends on "n_training_episodes" (the figure shows the output of the two formulas for several values of "n_training_episodes"). As a result, regardless of "n_training_episodes", epsilon decays exponentially from "max_epsilon" to "min_epsilon" over the whole range of episodes: epsilon = max_epsilon * ((min_epsilon/max_epsilon)**(1/(n_training_episodes-1))) ** episode [Figure: epsilon_exponentialDecay]

  • The following lines in the "train" function were removed. The "step" variable is unused, and "terminated" and "truncated" are assigned from the output of "env.step(action)" before their first use, so they do not need to be initialized.

    • step = 0
    • terminated = False
    • truncated = False
  • The "for" loop counter "step" and the "info" variable were replaced with "_" because they are unused.
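
For illustration, the changes listed above could look roughly like the sketch below. It is not the course notebook's code: "exponential_epsilon", "ToyChainEnv", the default hyperparameters, and the "seed" argument are hypothetical names and values introduced here, and the toy environment only imitates a Gymnasium-style "reset"/"step" interface so the example runs without extra dependencies. The formula assumes "min_epsilon" > 0 and "n_training_episodes" >= 2.

```python
import random


def exponential_epsilon(episode, n_training_episodes, max_epsilon=1.0, min_epsilon=0.05):
    """Epsilon for a 0-based episode index, using the formula proposed in this issue."""
    if n_training_episodes < 2:
        return max_epsilon  # guard: the formula divides by (n_training_episodes - 1)
    ratio = (min_epsilon / max_epsilon) ** (1 / (n_training_episodes - 1))
    return max_epsilon * ratio ** episode


class ToyChainEnv:
    """Hypothetical 5-state chain with a Gymnasium-style reset/step interface."""
    n_states, n_actions = 5, 2

    def reset(self):
        self.state = 0
        return self.state, {}

    def step(self, action):
        # Action 1 moves right, action 0 moves left (clamped at state 0).
        self.state = max(0, self.state + (1 if action == 1 else -1))
        terminated = self.state == self.n_states - 1
        reward = 1.0 if terminated else 0.0
        return self.state, reward, terminated, False, {}


def train(n_training_episodes, min_epsilon, max_epsilon, env, max_steps, qtable,
          learning_rate=0.7, gamma=0.95, seed=0):
    rng = random.Random(seed)
    for episode in range(n_training_episodes):
        epsilon = exponential_epsilon(episode, n_training_episodes, max_epsilon, min_epsilon)
        # No step/terminated/truncated initialization; unused values are bound to "_".
        state, _ = env.reset()
        for _ in range(max_steps):
            if rng.random() < epsilon:
                action = rng.randrange(env.n_actions)  # explore
            else:
                action = max(range(env.n_actions), key=lambda a: qtable[state][a])  # exploit
            new_state, reward, terminated, truncated, _ = env.step(action)
            # Tabular Q-learning update.
            target = reward + gamma * max(qtable[new_state])
            qtable[state][action] += learning_rate * (target - qtable[state][action])
            if terminated or truncated:
                break
            state = new_state
    return qtable
```

With this schedule, episode 0 yields "max_epsilon" and the last episode ("n_training_episodes" - 1) yields "min_epsilon", whatever the number of training episodes.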

fardinafdideh, Dec 10 '23 18:12

Thanks for pointing this out. I’m adding this for the December update.

simoninithomas, Dec 12 '23 08:12