DDPG
End-to-End Mobile Robot Navigation using DDPG (Continuous Control with Deep Reinforcement Learning) based on TensorFlow + Gazebo
(Notice: this repository is no longer updated. See https://github.com/m5823779/MotionPlannerUsingDDPG for the final version.)
Goal: Have the robot (TurtleBot) navigate to the target (enter the green circle)
Input: 10 laser range findings. Output: 2 actions, linear velocity in [0, 1] and angular velocity in [-1, 1] (action_dimension = 2)
Algorithm: DDPG (Actor with batch normalization, Critic without batch normalization). Training environment: Gazebo
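The actor head described above can be sketched as follows. This is a minimal NumPy sketch, not the repository's actual code; the hidden size and weight initialization are illustrative assumptions. The key point is how the 10 laser readings map to the 2 actions: a sigmoid squashes the first output into [0, 1] (linear velocity) and a tanh squashes the second into [-1, 1] (angular velocity), with batch normalization applied in the actor's hidden layer:

```python
import numpy as np

rng = np.random.default_rng(0)

def batch_norm(x, eps=1e-5):
    # Normalize each feature over the batch (training-time behaviour,
    # without the learned scale/shift parameters, for brevity).
    return (x - x.mean(axis=0)) / np.sqrt(x.var(axis=0) + eps)

def dense(x, w, b):
    return x @ w + b

# Illustrative sizes: 10 laser readings in, 2 actions out, hidden size assumed.
STATE_DIM, HIDDEN, ACTION_DIM = 10, 32, 2
w1 = rng.normal(0, 0.1, (STATE_DIM, HIDDEN)); b1 = np.zeros(HIDDEN)
w2 = rng.normal(0, 0.1, (HIDDEN, ACTION_DIM)); b2 = np.zeros(ACTION_DIM)

def actor(states):
    # Hidden layer with batch normalization (actor only, per the README).
    h = np.maximum(0.0, batch_norm(dense(states, w1, b1)))
    raw = dense(h, w2, b2)
    linear = 1.0 / (1.0 + np.exp(-raw[:, 0]))  # sigmoid -> linear velocity in [0, 1]
    angular = np.tanh(raw[:, 1])               # tanh -> angular velocity in [-1, 1]
    return np.stack([linear, angular], axis=1)

actions = actor(rng.normal(size=(8, STATE_DIM)))
print(actions.shape)  # (8, 2), each row = (linear velocity, angular velocity)
```

The squashing functions guarantee the velocity bounds by construction, which is also what makes saturation possible (see the problem section below).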
Source code: https://github.com/floodsung/DDPG
Testing result:
Following video is my testing result when action dimension = 1 (only controlling angular velocity; linear velocity fixed at 0.5 m/s)
The result is good enough
Following video is my testing result when action dimension = 2 (controlling both linear and angular velocity)
The result is quite bad:
the output saturates
Following video shows the outputs (first is linear velocity, second is angular velocity); the output always converges to 0
Problem:
When action dimension = 2, the actions saturate (the robot cannot navigate)
"Has anyone run into this problem and solved it?"
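One plausible cause (an assumption on my part, not a confirmed diagnosis of this repository) is the squashing nonlinearity itself: once the actor's pre-activations drift to large magnitudes during training, the tanh/sigmoid gradient collapses, so the policy gets pinned at a saturation value (e.g. linear velocity stuck at 0, the sigmoid's lower bound) and gradient updates can no longer move it. A small demo of the vanishing gradient:

```python
import math

def tanh_grad(x):
    # d/dx tanh(x) = 1 - tanh(x)^2
    return 1.0 - math.tanh(x) ** 2

def sigmoid_grad(x):
    # d/dx sigmoid(x) = sigmoid(x) * (1 - sigmoid(x))
    s = 1.0 / (1.0 + math.exp(-x))
    return s * (1.0 - s)

# Small pre-activation: the actor still receives a useful gradient.
print(tanh_grad(0.5))     # ~0.79
# Large pre-activation: the gradient collapses, the output is stuck
# near the saturation value, and learning stalls.
print(tanh_grad(5.0))     # ~1.8e-4
# Same effect for the sigmoid head: a large negative pre-activation
# pins the linear velocity near 0, matching the observed behaviour.
print(sigmoid_grad(-8.0)) # ~3.4e-4
```

Common mitigations in the DDPG literature include penalizing large pre-activations, reducing the actor's learning rate, and adding sufficient exploration noise; whether any of these fixes this particular setup is untested here.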
References:
https://arxiv.org/pdf/1703.00420.pdf
https://github.com/floodsung/DDPG