DDPG
End-to-End Mobile Robot Navigation using DDPG (Continuous Control with Deep Reinforcement Learning) based on TensorFlow + Gazebo
(Notice: this repository is no longer updated. See https://github.com/m5823779/MotionPlannerUsingDDPG for the final version.)
Goal: Have the robot (TurtleBot) navigate to the target (enter the green circle)
Input: 10 laser range findings. Output: 2 actions, linear velocity in [0, 1] and angular velocity in [-1, 1] (action_dimension = 2)
Algorithm: DDPG (Actor with batch normalization, Critic without batch normalization). Training environment: Gazebo
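The actor head described above can be sketched as follows. This is a minimal NumPy sketch, not the repository's actual code; the hidden size and weight initialization are illustrative assumptions. The key point is how the 10 laser readings map to the 2 actions: a sigmoid squashes the first output into [0, 1] (linear velocity) and a tanh squashes the second into [-1, 1] (angular velocity), with batch normalization applied in the actor's hidden layer:

```python
import numpy as np

rng = np.random.default_rng(0)

def batch_norm(x, eps=1e-5):
    # Normalize each feature over the batch (training-time behaviour,
    # without the learned scale/shift parameters, for brevity).
    return (x - x.mean(axis=0)) / np.sqrt(x.var(axis=0) + eps)

def dense(x, w, b):
    return x @ w + b

# Illustrative sizes: 10 laser readings in, 2 actions out, hidden size assumed.
STATE_DIM, HIDDEN, ACTION_DIM = 10, 32, 2
w1 = rng.normal(0, 0.1, (STATE_DIM, HIDDEN)); b1 = np.zeros(HIDDEN)
w2 = rng.normal(0, 0.1, (HIDDEN, ACTION_DIM)); b2 = np.zeros(ACTION_DIM)

def actor(states):
    # Hidden layer with batch normalization (actor only, per the README).
    h = np.maximum(0.0, batch_norm(dense(states, w1, b1)))
    raw = dense(h, w2, b2)
    linear = 1.0 / (1.0 + np.exp(-raw[:, 0]))  # sigmoid -> linear velocity in [0, 1]
    angular = np.tanh(raw[:, 1])               # tanh -> angular velocity in [-1, 1]
    return np.stack([linear, angular], axis=1)

actions = actor(rng.normal(size=(8, STATE_DIM)))
print(actions.shape)  # (8, 2), each row = (linear velocity, angular velocity)
```

The squashing functions guarantee the velocity bounds by construction, which is also what makes saturation possible (see the problem section below).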
Source code: https://github.com/floodsung/DDPG
Testing result:
Following video is my testing result when action dimension = 1 (only controlling angular velocity; linear velocity fixed at 0.5 m/s)
The result is good enough
Following video is my testing result when action dimension = 2 (controlling both linear and angular velocity)
The result is quite bad:
the output saturates
Following video shows the outputs (first is linear velocity, second is angular velocity); the output always converges to 0
Problem:
When action dimension = 2, the actions saturate (the robot cannot navigate)
"Has anyone run into this problem and solved it?"
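One plausible cause (an assumption on my part, not a confirmed diagnosis of this repository) is the squashing nonlinearity itself: once the actor's pre-activations drift to large magnitudes during training, the tanh/sigmoid gradient collapses, so the policy gets pinned at a saturation value (e.g. linear velocity stuck at 0, the sigmoid's lower bound) and gradient updates can no longer move it. A small demo of the vanishing gradient:

```python
import math

def tanh_grad(x):
    # d/dx tanh(x) = 1 - tanh(x)^2
    return 1.0 - math.tanh(x) ** 2

def sigmoid_grad(x):
    # d/dx sigmoid(x) = sigmoid(x) * (1 - sigmoid(x))
    s = 1.0 / (1.0 + math.exp(-x))
    return s * (1.0 - s)

# Small pre-activation: the actor still receives a useful gradient.
print(tanh_grad(0.5))     # ~0.79
# Large pre-activation: the gradient collapses, the output is stuck
# near the saturation value, and learning stalls.
print(tanh_grad(5.0))     # ~1.8e-4
# Same effect for the sigmoid head: a large negative pre-activation
# pins the linear velocity near 0, matching the observed behaviour.
print(sigmoid_grad(-8.0)) # ~3.4e-4
```

Common mitigations in the DDPG literature include penalizing large pre-activations, reducing the actor's learning rate, and adding sufficient exploration noise; whether any of these fixes this particular setup is untested here.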
References:
https://arxiv.org/pdf/1703.00420.pdf
https://github.com/floodsung/DDPG