chainerrl Performance evaluation and comparison of algorithms

It would be great to add performance evaluations and comparisons of the algorithms available in ChainerRL.

Jun 10 '17 08:06 muupan

Indeed, I like TRPO with exact second derivatives/Hessian-vector products :) It has nice theoretical properties.

Jun 10 '17 09:06 ajaytalati

I agree TRPO is great, but supporting TRPO is off-topic for this issue.

https://github.com/openai/rllab and https://github.com/openai/baselines do this kind of evaluation and comparison really well, so they are a good starting point.

Jun 11 '17 01:06 muupan

There are also PyTorch implementations, which I'm currently using (once Chainer has second derivatives, it will be possible to port these over):

https://github.com/mjacar/pytorch-trpo

https://github.com/ikostrikov/pytorch-trpo

ChainerRL seems to be a very promising alternative to the OpenAI repos mentioned above.

Jun 11 '17 01:06 ajaytalati

Here are DQN's scores on five Atari games: https://github.com/muupan/chainerrl/blob/benchmark-dqn/evaluations/visualize.ipynb

Jul 31 '17 21:07 muupan

Added DoubleDQN and PAL.

Aug 02 '17 22:08 muupan

Added DQN with prioritized replay.

Aug 04 '17 21:08 muupan