DDPG-Keras-Torcs
Should the critic network (Q-value) have 1 output? Why does the code use 3?
Hi, should the critic network output 1 Q-value or 3? The code uses 3, but I don't understand why. Thank you.
I think it should be 1. I don't know why it is 3.
Yes, I observed this too. Line 54 of the critic should be V = Dense(1, activation='linear')(h3) instead of V = Dense(action_dim, activation='linear')(h3).
Note that the Bellman target is r + gamma*Q. Since r is a scalar, Q must also be a scalar; otherwise we end up with 3 Bellman equations, one per action dimension!
I agree. Also, line 116 of ddpg.py, y_t = np.asarray([e[1] for e in batch]), is wrong. It should be y_t = np.asarray([e[2] for e in batch]).
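For anyone following along, here is a minimal sketch of how the targets could be built once the critic has a single output. It is not the repository's code. It assumes each replay entry is a (state, action, reward, next_state, done) tuple, so e[2] is the reward and e[4] is the done flag. The helper name bellman_targets and the gamma value are hypothetical, chosen only for illustration.

```python
import numpy as np

GAMMA = 0.99  # discount factor; an assumed value for this sketch


def bellman_targets(batch, target_q_values, gamma=GAMMA):
    """Build scalar Bellman targets y = r + gamma * Q' for a minibatch.

    batch: list of (state, action, reward, next_state, done) tuples
           (assumed layout, so e[2] is the reward and e[4] the done flag).
    target_q_values: target-critic outputs for the next states, one scalar
           per sample, shape (batch_size, 1) or (batch_size,).
    Returns an array of shape (batch_size, 1), matching a Dense(1) critic.
    """
    rewards = np.asarray([e[2] for e in batch], dtype=np.float64)
    dones = np.asarray([e[4] for e in batch], dtype=bool)
    q_next = np.asarray(target_q_values, dtype=np.float64).reshape(-1)

    # Terminal transitions keep only the reward; the others bootstrap from Q'.
    y = rewards + gamma * q_next * (~dones)
    return y.reshape(-1, 1)
```

With a one-output critic, the target has one value per sample rather than one per action dimension, which is why indexing the action (e[1]) gives the wrong shape.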
Has anyone managed to reproduce a model close to the author's? Following the modifications suggested in other issues, I have been able to train the model, but the car does not drive very well: it often leaves the track and cannot get back on. Do you have any suggestions? @QQwaken @guo253 @kaushikb258