RL-Learner-Lucky

Results 4 comments of RL-Learner-Lucky

Which paper? Can you tell me?

I can implement it on DQN successfully, but it fails with PPO.

`temp_grad2 = [g + 0 for g in temp_grads]` Because the variables in `temp_grads` use `tf.VariableSynchronization.ON_READ`, this operation triggers the on-read event, so `temp_grad2` should be the aggregated value across all replicas on the different devices. But in fact `temp_grad2`...
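To make the expectation above concrete, here is a small toy model in plain Python. It is not TensorFlow code: `ToyOnReadVariable` and its `read` method are hypothetical names made up for illustration, and SUM aggregation is an assumption. As far as I understand `tf.distribute`, a read of an ON_READ variable is aggregated when it happens in cross-replica context, while a read inside a replica sees only that replica's local copy, so it may be worth checking in which context `g + 0` runs.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ToyOnReadVariable:
    """Hypothetical stand-in for an ON_READ variable with SUM aggregation."""

    replica_values: List[float]  # one local copy per replica

    def read(self, replica_id: Optional[int] = None) -> float:
        if replica_id is None:
            # Cross-replica read: aggregate (here: sum) over all local copies.
            return sum(self.replica_values)
        # Read from inside one replica: only that replica's local copy.
        return self.replica_values[replica_id]


# Two "gradient" variables, each holding a different value on two replicas.
temp_grads = [ToyOnReadVariable([1.0, 3.0]), ToyOnReadVariable([0.5, 0.5])]

# Same `g + 0` pattern as in the comment, evaluated in the two contexts.
cross_replica = [g.read() + 0 for g in temp_grads]
replica_0 = [g.read(replica_id=0) + 0 for g in temp_grads]

print(cross_replica)  # [4.0, 1.0]
print(replica_0)      # [1.0, 0.5]
```

If the real code evaluates `g + 0` inside the per-replica step function, the second case is the one to compare against.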

I ran the code with the default parameters, so it uses your notbasicset files. I can also find some cards whose effects are not implemented.