PeiYingjun

Results 4 comments of PeiYingjun

Exactly, I'm trying to rewrite the code

Sorry, I mean I think it should be `kl_firstfixed = tf.reduce_sum(tf.stop_gradient(oldaction_dist) * tf.log(tf.stop_gradient(oldaction_dist + eps) / (oldaction_dist + eps))) / Nf`
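A rough sketch of why an expression of this shape is still useful even though its value is always zero: with the first argument held fixed, the KL term and its gradient vanish at the current parameters, but its curvature does not. The snippet below is plain Python, not the repository's TensorFlow code. It assumes a single softmax distribution over three hypothetical logits `theta`, leaves out the `/ Nf` normalization, and uses finite differences in place of automatic differentiation. Passing `fixed` as a constant stands in for `tf.stop_gradient`.

```python
import math

EPS = 1e-8  # assumed small constant, standing in for `eps` in the comment


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_firstfixed(fixed_dist, logits):
    """KL(fixed || current); fixed_dist is a constant, like a stop_gradient copy."""
    current = softmax(logits)
    return sum(p * math.log((p + EPS) / (q + EPS))
               for p, q in zip(fixed_dist, current))


def first_derivative(f, x, i, h=1e-5):
    """Central finite-difference estimate of df/dx_i."""
    up, down = list(x), list(x)
    up[i] += h
    down[i] -= h
    return (f(up) - f(down)) / (2.0 * h)


def second_derivative(f, x, i, h=1e-3):
    """Central finite-difference estimate of d^2 f / dx_i^2."""
    up, down = list(x), list(x)
    up[i] += h
    down[i] -= h
    return (f(up) - 2.0 * f(x) + f(down)) / (h * h)


theta = [0.2, -0.5, 1.0]   # hypothetical policy logits
fixed = softmax(theta)     # the "old" distribution, frozen at theta


def kl(logits):
    return kl_firstfixed(fixed, logits)


value = kl(theta)                         # zero at the current parameters
grad0 = first_derivative(kl, theta, 0)    # approximately zero
curv0 = second_derivative(kl, theta, 0)   # not zero
fisher00 = fixed[0] * (1.0 - fixed[0])    # softmax Fisher entry p_0 * (1 - p_0)
```

For a softmax over logits, the Hessian of this KL is `diag(p) - p p^T`, so `curv0` should land close to `fisher00`. That curvature is the quantity a Fisher-vector product would be built from.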

All right, after a quick analysis, I think it's reasonable to use the first definition of kl_first, yet I'm still confused about the losses, why do we try to...

> Thank you for your interest! We didn't update our recent version to our master branch.
> You should go to dLSTM branch dlstm_a2c folder to check the new one....