MountainCar-v0_DeepRL
OpenAI MountainCar-v0 DeepRL-based solutions (DQN, DuelingDQN, D3QN)
Investigation under the development of the master thesis "DeepRL-based Motion Planning for Indoor Mobile Robot Navigation" @ Institute of Systems and Robotics - University of Coimbra (ISR-UC)
Requirements
Module | Software/Hardware |
---|---|
Python IDE | PyCharm |
Deep Learning library | TensorFlow + Keras |
GPU | GeForce GTX 1060 |
Interpreter | Python 3.8 |
Packages | requirements.txt |
To set up PyCharm + Anaconda + GPU, consult the setup file here.
To import the required packages (requirements.txt), download the file into the project folder and type the following instruction in the project environment terminal:
pip install -r requirements.txt
:warning: WARNING :warning:
The training process generates a .txt file that tracks the network models (in 'tf' and .h5 formats) that met the solved requirement of the environment. Additionally, an overview image (graph) of the training procedure is created.
To run several training procedures, the .txt, .png, and directory names must be changed between runs. Otherwise, the information from previous training runs will be overwritten and lost.
When testing a saved network model in .h5 format, a 5-episode training run is required to initialize/build the keras.model network. The warnings above therefore also apply to this situation.
Loading the saved model in 'tf' is the recommended option. After finishing the testing, an overview image (graph) of the training procedure is also generated.
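One way to handle the renaming described in the warning above is to derive every output name from a per-run tag. The sketch below is only an illustration under stated assumptions: `make_run_paths`, the `runs` base directory, and the file-name pattern are hypothetical and are not part of this repository.

```python
from datetime import datetime
from pathlib import Path


def make_run_paths(base_dir, algo, tag=None):
    """Return distinct output paths for one training run (hypothetical helper)."""
    # A timestamp tag keeps successive runs from sharing file names.
    tag = tag or datetime.now().strftime("%Y%m%d_%H%M%S")
    run_dir = Path(base_dir) / f"{algo}_{tag}"
    return {
        "models": run_dir / "saved_networks",   # directory for the 'tf' / .h5 models
        "log": run_dir / f"{algo}_{tag}.txt",   # text file tracking the solved models
        "plot": run_dir / f"{algo}_{tag}.png",  # overview graph of the run
    }


# Example: two runs of the same algorithm never share a log file.
first = make_run_paths("runs", "dqn", tag="run01")
second = make_run_paths("runs", "dqn", tag="run02")
print(first["log"] != second["log"])  # True
```

The helper only builds paths; creating the directory (for example with `Path.mkdir(parents=True, exist_ok=True)`) is left to the training script.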
OpenAI MountainCar-v0
Actions:
0 - Push car to the left
1 - No action
2 - Push car to the right
States:
0 - Car position [-1.2, 0.6]
1 - Car velocity [-0.07, 0.07]
Rewards:
Scalar value (-1) for every step taken
Episode termination:
Car position (State 0) >= 0.5
Episode length reaches 200 steps
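The actions, state bounds, reward, and termination rules listed above can be sketched as a small stand-alone simulator. This is a rough illustration rather than the environment's actual code: the force and gravity constants are assumed to match the Gym reference implementation, the fixed start position of -0.5 is an assumption (Gym samples the start position at random), and `step` and `run_episode` are hypothetical names.

```python
import math

# Bounds, reward and termination threshold as listed in this README.
MIN_POS, MAX_POS = -1.2, 0.6
MAX_SPEED = 0.07
GOAL_POS = 0.5
MAX_STEPS = 200

# Assumption: force and gravity constants as in the Gym reference implementation.
FORCE, GRAVITY = 0.001, 0.0025


def step(position, velocity, action):
    """Advance one step; action is 0 (left), 1 (no action) or 2 (right)."""
    velocity += (action - 1) * FORCE - math.cos(3 * position) * GRAVITY
    velocity = max(-MAX_SPEED, min(MAX_SPEED, velocity))
    position = max(MIN_POS, min(MAX_POS, position + velocity))
    if position == MIN_POS and velocity < 0:
        velocity = 0.0  # the car stops at the left wall
    done = position >= GOAL_POS
    return position, velocity, -1.0, done


def run_episode(policy, position=-0.5, velocity=0.0):
    """Run until the goal is reached or MAX_STEPS elapse; return the total reward."""
    total = 0.0
    for _ in range(MAX_STEPS):
        action = policy(position, velocity)
        position, velocity, reward, done = step(position, velocity, action)
        total += reward
        if done:
            break
    return total


# Pushing right all the time never builds enough momentum to reach the goal.
print(run_episode(lambda pos, vel: 2))  # -200.0
```

A policy that pushes in the direction of the current velocity, such as `lambda pos, vel: 2 if vel >= 0 else 0`, is a common hand-written baseline to compare the learned agents against.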
Solved Requirement:
Average reward of at least -110.0 over 100 consecutive trials
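The solved requirement can be checked with a rolling window over episode rewards. The following is a minimal sketch; `make_solved_checker` is a hypothetical helper and not taken from the repository's training scripts.

```python
from collections import deque


def make_solved_checker(threshold=-110.0, window=100):
    """Return a function that records episode rewards and reports whether the
    mean over the last `window` episodes has reached `threshold`."""
    recent = deque(maxlen=window)

    def update(episode_reward):
        recent.append(episode_reward)
        # Only a full window of consecutive episodes counts.
        return len(recent) == window and sum(recent) / window >= threshold

    return update


check = make_solved_checker()
solved = [check(-105.0) for _ in range(100)]
print(solved[98], solved[99])  # False True
```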
Deep Q-Network (DQN)
Train | Test |
---|---|
(training overview image) | (testing overview image) |
Network model used for testing: 'saved_networks/dqn_model20' ('tf' model, also available in .h5)
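For orientation, the one-step target that a standard DQN regresses towards can be written in a few lines. This is a generic sketch of the textbook rule, not the repository's training code; the discount factor `gamma=0.99` is an assumed value and `dqn_target` is a hypothetical name.

```python
def dqn_target(reward, next_q_values, done, gamma=0.99):
    """One-step DQN target: r + gamma * max_a' Q_target(s', a'), or r if terminal.

    `next_q_values` holds the target network's Q-values for the next state,
    one entry per action (left, no action, right).
    """
    if done:
        return reward  # no bootstrapping past a terminal state
    return reward + gamma * max(next_q_values)


# Example with the per-step reward of -1 used by MountainCar-v0.
print(dqn_target(-1.0, [-20.0, -18.0, -19.0], done=False))
```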
Dueling DQN
Train | Test |
---|---|
(training overview image) | (testing overview image) |
Network model used for testing: 'saved_networks/duelingdqn_model172' ('tf' model, also available in .h5)
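A dueling network splits its output into a state value and per-action advantages, then recombines them into Q-values. The sketch below shows the commonly used mean-subtraction form; whether this repository uses the mean or another aggregation is not stated here, so treat it as an assumption. `dueling_q_values` is a hypothetical name.

```python
def dueling_q_values(state_value, advantages):
    """Combine V(s) and A(s, a) as Q(s, a) = V(s) + A(s, a) - mean_a A(s, a)."""
    mean_adv = sum(advantages) / len(advantages)
    return [state_value + adv - mean_adv for adv in advantages]


# Three advantages, one per MountainCar action (left, no action, right).
print(dueling_q_values(-20.0, [1.0, 0.0, 2.0]))  # [-20.0, -21.0, -19.0]
```

Subtracting the mean makes the split identifiable: the Q-values average to the state value, so the greedy action depends only on the advantages.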
Dueling Double DQN (D3QN)
Train | Test |
---|---|
(training overview image) | (testing overview image) |
Network model used for testing: 'saved_networks/d3qn_model300' ('tf' model, also available in .h5)
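D3QN pairs the dueling architecture with the Double DQN target, in which the online network picks the next action and the target network evaluates it. The following is a generic sketch of that target rule, not the repository's code; `gamma=0.99` is an assumed value and `double_dqn_target` is a hypothetical name.

```python
def double_dqn_target(reward, next_q_online, next_q_target, done, gamma=0.99):
    """Double DQN target: the online net selects the action, the target net scores it."""
    if done:
        return reward  # no bootstrapping past a terminal state
    best_action = max(range(len(next_q_online)), key=next_q_online.__getitem__)
    return reward + gamma * next_q_target[best_action]


online = [-20.0, -18.0, -19.0]  # online network prefers action 1
target = [-17.0, -19.0, -18.0]  # target network would prefer action 0
print(double_dqn_target(-1.0, online, target, done=False))
```

In this example a plain DQN target would bootstrap from the target network's own maximum (-17.0), while the Double DQN target uses the target network's estimate for the action chosen by the online network (-19.0), which reduces overestimation.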