Xiaohu Zhu
I think DDPG could be added; it performs better on continuous action spaces. Looking forward to it. :)
FeUdal Networks for Hierarchical Reinforcement Learning https://arxiv.org/pdf/1703.01161.pdf
Neural Fictitious Self Play https://arxiv.org/abs/1603.01121
Interpolated Policy Gradient: Merging On-Policy and Off-Policy Gradient Estimation for Deep RL https://arxiv.org/pdf/1706.00387.pdf
small fix