DDPG-Keras-Torcs
Training does not learn anything
Hi,
After training for about 75000 steps, only 8 episodes have passed and the agent has not learnt anything useful. My question is: does the current ddpg.py file have the correct hyper-parameters? The saved .h5 model in this repo does indeed work well, so I was wondering if it was trained with the same hyper-parameters.
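For reference, the kind of hyper-parameters typically used for DDPG on TORCS looks like the block below. These names and values are illustrative assumptions only; they are not confirmed to be the values in ddpg.py or the ones used to train the released .h5 model.

```python
# Illustrative DDPG hyper-parameters (assumed values, not necessarily
# what this repo's ddpg.py uses or what the saved model was trained with).
BUFFER_SIZE = 100000   # replay buffer capacity
BATCH_SIZE = 32        # minibatch size per gradient update
GAMMA = 0.99           # discount factor
TAU = 0.001            # soft target-network update rate
LRA = 0.0001           # actor learning rate
LRC = 0.001            # critic learning rate
EXPLORE = 100000.0     # decay horizon for the exploration noise
```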
A part of the output:
('Episode', 8, 'Step', 76169, 'Action', array([[ 0.65779338, 0.13147314, 0.72898711]]), 'Reward', -0.11919999787368826, 'Loss', 0.76000809669494629)
('Episode', 8, 'Step', 76170, 'Action', array([[ 0.72513585, 0.05947028, 0.72827826]]), 'Reward', -0.23072561310301709, 'Loss', 0.12980197370052338)
('Episode', 8, 'Step', 76171, 'Action', array([[ 0.71874098, 0.12731329, 0.73625542]]), 'Reward', 0.012969128165889354, 'Loss', 0.46787607669830322)
('Episode', 8, 'Step', 76172, 'Action', array([[ 0.6552096 , 0.11575382, 0.72765934]]), 'Reward', -0.0072166273178536217, 'Loss', 0.9427763819694519)
How the TORCS screen looks right now:
In short, the training does not seem to proceed smoothly. Can you please verify if the current version of the code trains well? If not, I suspect the hyper-parameters may have changed between the model which worked well and this version of the code.
Thanks,
Can you uncomment lines 146 to 155 in gym_torcs.py during training? I uncommented them during my actual testing to prove that the agent did not terminate early during the test run. Thanks
Thanks, I'll try that. Also, roughly how many episode_count or max_steps did you have to train for before you arrived at the saved .h5 models?
For a simple track you should be able to get some reasonable results within an episode_count of about 200. Cheers.
Oh, thanks. Which tracks are simple? And which ones are more complicated? I usually run torcs in practice mode. Is that one of the simpler ones?
@sahiliitm Mine most of the time started learning only after about 40 epochs (training with only the steering angle), so maybe you just have to train for more episodes.
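If you want to try the steering-only setup mentioned above, a minimal sketch of an actor network that outputs only the steering command is shown below. The layer sizes and the 29-dimensional low-level state input are illustrative assumptions, not taken from this repo's actor definition.

```python
# Sketch of a steering-only actor (sizes and state dimension are assumptions).
from keras.models import Model
from keras.layers import Dense, Input

def build_steering_only_actor(state_dim=29):
    s = Input(shape=(state_dim,))
    h = Dense(300, activation='relu')(s)
    h = Dense(600, activation='relu')(h)
    # Single output in [-1, 1]: the steering command only.
    steer = Dense(1, activation='tanh')(h)
    return Model(inputs=s, outputs=steer)
```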
@saiprabhakar @yanpanlau Hi, I have uncommented lines 146 to 155 in gym_torcs.py and tried to train my network. It seems that the episodes end quickly this way. I have gone through about 1800 episodes and 86671 steps. The agent seems to have learnt how to steer but has a problem with accelerating.
Do I need to train for more episodes to learn acceleration? Also, have you ever tried other reward functions?
Excuse me for so many questions. Thanks!
Lines 146 to 155 in gym_torcs.py serve two purposes: 1) the game resets if the car is outside of the track, and 2) the game resets if the car is moving extremely slowly.
I found that the learning speed improves significantly during the initial stage of training when you turn on 1) and 2). For example, if the car is moving extremely slowly, the replay buffer will store frames which are not interesting (since the car is not moving), so the gradient updates are very small and learning is slow.
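For readers who cannot locate those lines, the checks being discussed have roughly the following shape. This is a sketch only: the thresholds, variable names and observation fields (track, angle) are assumptions and may not match the actual gym_torcs.py.

```python
import numpy as np

# Sketch of the early-termination checks discussed above
# (thresholds and field names are illustrative, not copied from gym_torcs.py).
def should_terminate(obs, progress, step, termination_limit_progress=0.5):
    episode_terminate = False

    # 1) Reset if the car has left the track.
    if np.min(obs['track']) < 0:
        episode_terminate = True

    # 2) Reset if the car is moving extremely slowly
    #    (otherwise the replay buffer fills with uninteresting frames).
    if progress < termination_limit_progress and step > 20:
        episode_terminate = True

    # Also reset if the car is heading backwards.
    if np.cos(obs['angle']) < 0:
        episode_terminate = True

    return episode_terminate
```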
Hi,
After training for about 45000 steps, 1000 episodes have passed and the agent has not learnt anything useful. The total reward does not change, and the loss has been 0. There seems to be an error, as shown.
This problem has troubled me for a month. Can you answer it? Thank you very much.
Maybe you should uncomment those lines in gym_torcs.py to ensure the car does not get stuck in a local minimum during training. Those lines are used to terminate a training episode when the car moves out of the track, moves too slowly, etc. You can find these lines discussed in the other issues.
@DavisZuo Thank you very much! First of all, the problem in the figure above has been solved; it was a problem with the path name. Second, I have uncommented lines 146 to 155 of gym_torcs.py. Finally, I still have problems: the total reward does not change and the loss has been 0. Where is the reward function set? I can't seem to find it. Why does my car only go straight and not learn to take bends?
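For anyone looking for it: in this kind of gym_torcs setup the reward is usually computed inside the environment's step function in gym_torcs.py, not in ddpg.py. Below is a minimal sketch of the commonly used shaping (reward speed along the track axis, penalise lateral speed and distance from the track centre); the observation field names and the exact formula are assumptions and may differ from this repo's code.

```python
import numpy as np

# Sketch of a typical TORCS reward (field names speedX, angle, trackPos
# follow the usual gym_torcs observation; treat this as an assumption).
def compute_reward(obs):
    sp = obs['speedX']
    reward = (sp * np.cos(obs['angle'])            # forward progress
              - np.abs(sp * np.sin(obs['angle']))  # lateral speed penalty
              - sp * np.abs(obs['trackPos']))      # off-centre penalty
    return reward
```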
@lingyunli1994 I have had the same problem recently. Have you solved it? Can you help me? Thanks!
@lingyunli1994 How did you solve this problem? I have the same one: "fopen(config/graph.xm.) failed". Thanks!
Have you ever solved this problem? Did the car run successfully? @BCWang93