Deep Reinforcement Learning algorithms for Policy Value methods written from scratch.

policy-value-methods

My implementations of a bunch of policy value methods, written from scratch.

Algorithms:

  1. Hill Climb
  2. Cross Entropy Method
  3. Policy Gradient Methods
    1. REINFORCE
    2. PPO (Proximal Policy Optimization) Video
    3. Actor Critic
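
As a rough illustration of the simplest method in the list, here is a minimal hill climbing sketch. It is not the code from this repo: `toy_return` is a hypothetical stand-in for an episode return, and `noise_scale`, `iterations`, and `seed` are assumed parameters chosen only for the example. The idea is to perturb the policy weights with random noise and keep the perturbation only when the score improves.

```python
import random


def hill_climb(evaluate, weights, noise_scale=0.5, iterations=200, seed=0):
    """Keep a random perturbation of the weights only if it scores higher."""
    rng = random.Random(seed)
    best_weights = list(weights)
    best_score = evaluate(best_weights)
    for _ in range(iterations):
        # Propose a candidate by adding Gaussian noise to the current best.
        candidate = [w + rng.gauss(0.0, noise_scale) for w in best_weights]
        score = evaluate(candidate)
        # Accept the candidate only when it strictly improves the score.
        if score > best_score:
            best_weights, best_score = candidate, score
    return best_weights, best_score


# Hypothetical stand-in for an episode return: it peaks at weights (1.0, -2.0).
def toy_return(weights):
    return -((weights[0] - 1.0) ** 2 + (weights[1] + 2.0) ** 2)


best_weights, best_score = hill_climb(toy_return, [0.0, 0.0])
print(best_weights, best_score)
```

In a real setting, `evaluate` would run one or more episodes in the environment with the given weights and return the total reward.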

Results:

LunarLander (REINFORCE) {Solved in 519 episodes}

BipedalWalker-v3 (TD3) {completion time ~14 seconds, achieved after 500 episodes}

[Plot: score and rolling score]
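
For readers unfamiliar with the rolling score, here is a small sketch of how such a curve can be computed from per-episode scores. The window size of 100 and the helper names `rolling_scores` and `first_solved_episode` are assumptions for illustration; the source does not state which window or "solved" threshold was used.

```python
from collections import deque


def rolling_scores(scores, window=100):
    """Mean of the most recent `window` scores after each episode."""
    recent = deque(maxlen=window)  # old scores fall out automatically
    averages = []
    for score in scores:
        recent.append(score)
        averages.append(sum(recent) / len(recent))
    return averages


def first_solved_episode(scores, threshold, window=100):
    """1-based episode at which a full window first averages >= threshold."""
    for episode, average in enumerate(rolling_scores(scores, window), start=1):
        # Require a full window so early lucky episodes do not count.
        if episode >= window and average >= threshold:
            return episode
    return None


print(rolling_scores([1, 2, 3, 4], window=2))
print(first_solved_episode([0, 0, 10, 10], threshold=10, window=2))
```

The second helper returns `None` when the threshold is never reached, which makes an unsolved run easy to detect.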