Avinash Ummadisingu

5 comments by Avinash Ummadisingu

Here are a couple of differences from the original paper I noticed:
- Using target network to pick actions during evaluation. From the paper:
  > Apart from using the target...

Thanks for this! I think it's quite useful in a number of situations. Since we're adding a public function to the standard buffer, would you be able to expand this...