
Actor network output increases to 1, TORCS, TF 1.0.0

Open Amir-Ramezani opened this issue 7 years ago • 1 comment

Hi,

Thanks for your code.

I tried to use it for training on TORCS, but my results are not good. Specifically, after a few steps the actions generated by the actor network increase to 1.0 and stay there. The first 10 rows of a batch look like this:

```
[[ 1.  1.  1.]
 [ 1.  1.  1.]
 [ 1.  1.  1.]
 [ 1.  1.  1.]
 [ 1.  1.  1.]
 [ 1.  1.  1.]
 [ 1.  1.  1.]
 [ 1.  1.  1.]
 [ 1.  1.  1.]
 [ 1.  1.  1.]]
```

The gradients for that batch:

```
[[ 4.80426752e-05  1.51122265e-04 -1.96302353e-05]
 [ 4.80426752e-05  1.51122265e-04 -1.96302353e-05]
 [ 4.80426752e-05  1.51122265e-04 -1.96302353e-05]
 [ 4.80426752e-05  1.51122265e-04 -1.96302353e-05]
 [ 4.80426752e-05  1.51122265e-04 -1.96302353e-05]
 [ 4.80426752e-05  1.51122265e-04 -1.96302353e-05]
 [ 4.80426752e-05  1.51122265e-04 -1.96302353e-05]
 [ 4.80426752e-05  1.51122265e-04 -1.96302353e-05]
 [ 4.80426752e-05  1.51122265e-04 -1.96302353e-05]
 [ 4.80426752e-05  1.51122265e-04 -1.96302353e-05]]
```
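(Identical rows across the batch, with tiny magnitudes, are consistent with a saturated tanh output layer: once the pre-activations grow large, every state maps to the same action of 1.0 and the local derivative of tanh collapses toward zero, so almost no gradient flows back to the actor. A minimal NumPy sketch of the effect, purely for illustration:)

```python
import numpy as np

# tanh output layer: action = tanh(z), with d(action)/dz = 1 - tanh(z)**2
z_healthy = 0.5    # unsaturated pre-activation
z_saturated = 8.0  # large pre-activation, as after runaway weight growth

act_healthy = np.tanh(z_healthy)
act_saturated = np.tanh(z_saturated)

grad_healthy = 1.0 - act_healthy**2
grad_saturated = 1.0 - act_saturated**2

print(act_saturated > 0.9999)   # action is pinned at the upper bound
print(grad_saturated < 1e-6)    # almost no gradient flows back
print(grad_healthy > 0.5)       # unsaturated units still learn
```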

I suspect the problem is somewhere around the following line:

```python
# Combine the gradients here
self.actor_gradients = tf.gradients(self.scaled_out, self.network_params, -self.action_gradient)
```
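(For context, this line implements the deterministic policy gradient chain rule: `tf.gradients` with `grad_ys = -self.action_gradient` weights ∂a/∂θ by −∂Q/∂a, giving an ascent direction on Q. One known pitfall of this formulation is that `tf.gradients` *sums* over the batch rather than averaging, so the effective step size scales with batch size; a common remedy is to divide the combined gradients by the batch size before applying them. A NumPy sketch of the combination for a hypothetical one-layer tanh actor, with stand-in values for the critic's action gradients:)

```python
import numpy as np

rng = np.random.default_rng(0)
batch, state_dim, act_dim = 4, 3, 1

W = rng.normal(size=(state_dim, act_dim))  # tiny hypothetical actor: a = tanh(S @ W)
S = rng.normal(size=(batch, state_dim))    # batch of states
A = np.tanh(S @ W)                         # actions bounded in [-1, 1]

dQ_dA = rng.normal(size=(batch, act_dim))  # critic's dQ/da for each sample (stand-in)

# Chain rule through tanh, weighted by -dQ/da (gradient ascent on Q)
local = (1.0 - A**2) * (-dQ_dA)            # shape (batch, act_dim)
summed_grad = S.T @ local                  # what tf.gradients returns: a SUM over the batch
mean_grad = summed_grad / batch            # batch-size normalization used by many DDPG implementations

print(np.allclose(summed_grad, mean_grad * batch))  # True
```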

Could you tell me what you think the problem is?

I am using tf 1.0.0 CPU version.

Thanks

Amir-Ramezani avatar Apr 03 '17 07:04 Amir-Ramezani

Hi! I am very interested in this issue. Could you tell me the details of your solution?

RICEVAGUE avatar Mar 28 '19 07:03 RICEVAGUE