chainerrl
chainerrl copied to clipboard
interpolated policy gradient (IPG)
Interpolated Policy Gradient: Merging On-Policy and Off-Policy Gradient Estimation for Deep RL
https://arxiv.org/pdf/1706.00387.pdf