MPO

Published 20 hours ago

acyclics

Metadata

PyTorch implementation of "Maximum a Posteriori Policy Optimization" with Retrace for discrete Gym environments

Readme
Issues

Results: 1 MPO issue

q_ret update not used

I have enjoyed your really clean implementation of MPO. Thank you for making it available. I was looking at the critic update and think I may have spotted a bug....

mvindiola1

Owner

acyclics
