Randomized-Ensembled-Double-Q-learning-REDQ-
PyTorch implementation of Randomized Ensembled Double Q-learning (REDQ)
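REDQ's core idea is to keep an ensemble of N critics and, for each Bellman target, take the minimum over a small random subset of size M (M = 2 in the paper's default). Below is a minimal sketch of that target computation, not this repo's exact code; the network sizes, `gamma`, and the omission of the entropy term are simplifying assumptions.

```python
import numpy as np
import torch
import torch.nn as nn

N, M = 10, 2   # ensemble size and subset size (REDQ paper defaults)
gamma = 0.99   # discount factor (assumption for illustration)

# Toy target critics mapping a (state, action) vector to a scalar Q-value
target_critics = nn.ModuleList(
    [nn.Sequential(nn.Linear(4, 32), nn.ReLU(), nn.Linear(32, 1)) for _ in range(N)]
)

def redq_target(next_sa: torch.Tensor, reward: torch.Tensor, done: torch.Tensor) -> torch.Tensor:
    # Sample a random subset of M distinct critics from the ensemble
    idx = np.random.choice(N, size=M, replace=False)
    qs = torch.stack([target_critics[i](next_sa) for i in idx], dim=0)
    min_q = qs.min(dim=0).values  # in-subset minimum controls overestimation
    return reward + gamma * (1.0 - done) * min_q

batch = 8
y = redq_target(torch.randn(batch, 4), torch.zeros(batch, 1), torch.zeros(batch, 1))
print(y.shape)
```

Taking the min over only M of N critics (rather than all N) is what lets REDQ use high update-to-data ratios without the pessimism of a full-ensemble minimum.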
Issues
Should the actor update not use idx[0] and idx[1] for Q1 and Q2? Currently it just gets the same Q-value from the same critic. # ---------------------------- update actor...
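The bug described in this issue can be avoided by drawing the two indices without replacement, so Q1 and Q2 come from two different ensemble members. This is a hypothetical sketch of such a fix, not the repository's actual `update actor` code; the critic architecture and ensemble size are assumptions.

```python
import numpy as np
import torch
import torch.nn as nn

N = 10  # ensemble size (assumption)

critics = nn.ModuleList(
    [nn.Sequential(nn.Linear(4, 32), nn.ReLU(), nn.Linear(32, 1)) for _ in range(N)]
)

def actor_q_value(state_action: torch.Tensor) -> torch.Tensor:
    # replace=False guarantees idx[0] != idx[1], so the two Q-values
    # come from two distinct critics rather than the same one twice
    idx = np.random.choice(N, size=2, replace=False)
    q1 = critics[idx[0]](state_action)
    q2 = critics[idx[1]](state_action)
    return torch.min(q1, q2)

sa = torch.randn(8, 4)
q = actor_q_value(sa)
print(q.shape)
```

Note that the original REDQ paper's actor objective averages over all N critics rather than taking a two-critic minimum; either way, sampling with `replace=False` is what prevents the duplicate-critic degenerate case the issue points out.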
Hi, I was wondering whether you have applied this idea to environments with discrete action spaces, or have any sense of how it would perform there? Thanks