DDPG-Keras-Torcs Training does not learn anything

Hi,

After training for about 75000 steps, only 8 episodes have passed and the agent has not learnt anything useful. My question is: does the current ddpg.py file have the correct hyper-parameters? The saved .h5 model in this repo does indeed work well, so I was wondering if it was trained using the same hyper-parameters.

A part of the output:

('Episode', 8, 'Step', 76169, 'Action', array([[ 0.65779338,  0.13147314,  0.72898711]]), 'Reward', -0.11919999787368826, 'Loss', 0.76000809669494629)
('Episode', 8, 'Step', 76170, 'Action', array([[ 0.72513585,  0.05947028,  0.72827826]]), 'Reward', -0.23072561310301709, 'Loss', 0.12980197370052338)
('Episode', 8, 'Step', 76171, 'Action', array([[ 0.71874098,  0.12731329,  0.73625542]]), 'Reward', 0.012969128165889354, 'Loss', 0.46787607669830322)
('Episode', 8, 'Step', 76172, 'Action', array([[ 0.6552096 ,  0.11575382,  0.72765934]]), 'Reward', -0.0072166273178536217, 'Loss', 0.9427763819694519)

How the TORCS screen looks right now: (screenshot selection_137)

In short, the training does not seem to proceed smoothly. Can you please verify if the current version of the code trains well? If not, I suspect the hyper-parameters may have changed between the model which worked well and this version of the code.

Thanks,

Oct 17 '16 01:10 sahiliitm

Can you uncomment lines 146 to 155 in gym_torcs.py during training? I commented them out for the actual test run to show that the agent did not terminate early. Thanks

Oct 17 '16 02:10 yanpanlau

Thanks, I'll try that. Also, roughly how many episodes (episode_count) or max_steps did you have to train for before you arrived at the saved .h5 models?

Oct 17 '16 02:10 sahiliitm

For a simple track you should be able to get reasonable results within an episode_count of 200. Cheers.

Oct 17 '16 02:10 yanpanlau

Oh, thanks. Which tracks are simple? And which ones are more complicated? I usually run torcs in practice mode. Is that one of the simpler ones?

Oct 17 '16 02:10 sahiliitm

@sahiliitm most of the time mine started learning only after 40 epochs (with only the steering angle), so maybe you just have to train for more episodes.

Nov 07 '16 06:11 saiprabhakar

@saiprabhakar @yanpanlau Hi, I have uncommented lines 146 to 155 in gym_torcs.py and tried to train my network. The episodes seem to end quickly this way. I have run about 1800 episodes and 86671 steps. The agent seems to have learnt how to steer but has trouble accelerating.
Do I need to train for more episodes to learn acceleration? Also, have you ever tried other reward functions? Excuse me for so many questions. Thanks!

Mar 03 '17 09:03 DavisZuo

Lines 146 to 155 in gym_torcs.py serve two purposes: 1) the game resets if the car goes off the track; 2) the game resets if the car is moving extremely slowly.

I found that learning speed improves significantly during the initial stage of training when you turn on 1) and 2). For example, if the car is moving extremely slowly, the replay buffer fills with uninteresting frames (as the car is not moving), so the gradient updates are extremely small and the agent learns slowly.
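
For readers who do not have the file open, the two reset conditions can be sketched roughly as below. This is only an illustration, not the actual code from gym_torcs.py: the function name, its arguments, and the default thresholds are all hypothetical, and it assumes that a negative track range-finder reading means the car has left the track.

```python
def should_terminate(track_sensors, progress, time_step,
                     judge_start=100, min_progress=1.0):
    """Decide whether to end the episode early.

    Hypothetical helper: names and default thresholds are illustrative
    assumptions, not values taken from gym_torcs.py.
    Returns a (terminate, reason) tuple.
    """
    # 1) Off track: assumed to show up as a negative range-finder reading.
    if min(track_sensors) < 0:
        return True, "off_track"

    # 2) Too slow: only judged after a warm-up period, so the car has
    #    time to start moving at the beginning of an episode.
    if time_step > judge_start and progress < min_progress:
        return True, "too_slow"

    return False, None


# Example: a car that is on the track but barely moving at step 150.
print(should_terminate([0.5, 0.7, 0.6], progress=0.1, time_step=150))
```

Ending the episode in either case keeps long runs of near-identical, stationary frames out of the replay buffer, which is the effect described above.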

Mar 03 '17 09:03 yanpanlau

Hi, after training for about 45000 steps, 1000 episodes have passed and the agent has not learnt anything useful. The total reward does not change, and the loss has stayed at 0. There seems to be an error, as shown. (screenshot 471807239498985096)

This problem has troubled me for a month. Can you answer it? Thank you very much.

Apr 02 '18 13:04 lingyunli1994

Maybe you should uncomment those several lines in gym_torcs.py to ensure the car does not get stuck in a local minimum during training. Those lines terminate a training episode when the car moves off the track, moves too slowly, and so on. You can find these lines in the other issues.

Apr 03 '18 05:04 DavisZuo

@DavisZuo Thank you very much! First, the problem in the figure above, which I have solved, was a path name problem. Second, I have uncommented lines 146 to 155 of gym_torcs.py. Finally, I still have problems: the total reward does not change, and the loss has stayed at 0. Where is the reward function set? I can't seem to find it. Also, why can my car only go straight and not learn to turn?

Apr 03 '18 06:04 lingyunli1994

@lingyunli1994 I have run into the same problem recently. Have you solved it? Can you help me? Thanks!

Nov 12 '18 13:11 BCWang93

@lingyunli1994 How did you solve this problem? I have the same problem: "fopen(config/graph.xm.) failed". Thanks!

Nov 12 '18 14:11 BCWang93

Have you ever solved this problem? Did the car run successfully? @BCWang93

Jan 05 '21 03:01 Maxwell2017
