Avinash Ummadisingu
Results: 5 comments of Avinash Ummadisingu
Here are a couple of differences from the original paper I noticed:
- Using target network to pick actions during evaluation. From the paper:
  > Apart from using the target...
Thanks for this! I think it's quite useful in a number of situations. Since we're adding a public function to the standard buffer, would you be able to expand this...