reinforcement-learning
Reinforcement learning examples of Policy Gradients, PPO+GAE, and DDQN using OpenAI Gym and PyTorch
Reinforcement Learning Examples
Pong environment

Policy Gradients
Checkpoint weights
Lunar Lander environment

Deep Q-Network
Checkpoint weights
Policy Gradients
Checkpoint weights
CartPole environment

Policy Gradients
Checkpoint weights
Deep Q-Network
Checkpoint weights
Mario environment

Policy Gradients
Checkpoint weights
Plot of average reward per 10 episodes
Double Deep Q-Network
Checkpoint weights
Plot of average reward per 10 episodes
Highway environments
Highway environment
https://user-images.githubusercontent.com/8986329/123654361-ad2a5b80-d836-11eb-8b43-f1d949e93eca.mp4
Double Deep Q-Network
Checkpoint weights
Merge environment
https://user-images.githubusercontent.com/8986329/123638313-b5c76580-d827-11eb-9c3b-a757de345d43.mp4
Double Deep Q-Network
Checkpoint weights
Roundabout environment
https://user-images.githubusercontent.com/8986329/123638588-0212a580-d828-11eb-9b70-397898209223.mp4
Double Deep Q-Network
Checkpoint weights
Intersection environment
https://user-images.githubusercontent.com/8986329/123638705-24a4be80-d828-11eb-866b-4bfb0b8d1234.mp4
Double Deep Q-Network
Checkpoint weights
Parking environment
https://user-images.githubusercontent.com/8986329/123638469-e3acaa00-d827-11eb-92db-194c340d3ba4.mp4
Racetrack environment
https://user-images.githubusercontent.com/8986329/141271898-b30fa0d2-7a78-4ff4-9b0f-8cf996936473.mp4
PyBullet Walker2D environment
https://user-images.githubusercontent.com/8986329/123640398-cb3d8f00-d829-11eb-8619-0688a4035ff7.mp4
Plot of average reward per 50 episodes
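
The reward plots listed above are described as averages over 10 or 50 episodes. The README does not say how those averages are computed, so the following is only a minimal sketch of one plausible approach: averaging per-episode rewards over consecutive, non-overlapping blocks. The function name `block_average` and its parameters are hypothetical and not taken from this repository; a rolling window would be an equally reasonable choice.

```python
from statistics import mean


def block_average(rewards, block_size=10):
    """Average per-episode rewards over consecutive, non-overlapping blocks.

    Hypothetical helper for illustration only; the repository may compute
    its plotted averages differently (for example, with a rolling window).
    """
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    averages = []
    for start in range(0, len(rewards), block_size):
        block = rewards[start:start + block_size]
        # Drop a trailing partial block so every point covers
        # the same number of episodes.
        if len(block) == block_size:
            averages.append(mean(block))
    return averages


if __name__ == "__main__":
    # Synthetic rewards for 20 episodes: 0.0, 1.0, ..., 19.0
    episode_rewards = [float(i) for i in range(20)]
    print(block_average(episode_rewards, block_size=10))  # [4.5, 14.5]
```

Passing `block_size=50` would give the coarser averaging mentioned for the Walker2D plot, under the same assumption about how the blocks are formed.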