Performance evaluation and comparison of algorithms
It would be great to add performance evaluations and comparisons of the algorithms available in ChainerRL.
Indeed. I like TRPO with exact second derivatives/Hessian-vector products :) It has nice theoretical properties.
I agree TRPO is great, but supporting TRPO is off-topic for this issue.
https://github.com/openai/rllab and https://github.com/openai/baselines are doing such evaluation and comparison really well, so it's good to start from them.
There are also PyTorch implementations, which I'm currently using (once Chainer has second derivatives, it will be possible to port these over):
https://github.com/mjacar/pytorch-trpo
https://github.com/ikostrikov/pytorch-trpo
ChainerRL seems very promising as an alternative to the OpenAI repos mentioned above.
Here are DQN's scores on five Atari games: https://github.com/muupan/chainerrl/blob/benchmark-dqn/evaluations/visualize.ipynb
Added DoubleDQN and PAL.
Added DQN with prioritized replay.
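For comparing several algorithms across games, here is a minimal sketch of how per-seed evaluation scores might be summarized into one table. This is not the code used in the notebook linked above. The function names (`summarize`, `format_table`), the keys (`algo_a`, `algo_b`, `game_x`), and the numbers are all hypothetical placeholders, not real results.

```python
from statistics import mean, stdev


def summarize(scores):
    """Map {(algo, game): [per-seed scores]} to {(algo, game): (mean, std, n)}."""
    summary = {}
    for key, runs in scores.items():
        # The sample standard deviation needs at least two runs.
        std = stdev(runs) if len(runs) > 1 else 0.0
        summary[key] = (mean(runs), std, len(runs))
    return summary


def format_table(summary):
    """Render the summary as plain text, one row per (algorithm, game) pair."""
    lines = ["algorithm  game  mean  std  n"]
    for (algo, game), (m, s, n) in sorted(summary.items()):
        lines.append(f"{algo}  {game}  {m:.1f}  {s:.1f}  {n}")
    return "\n".join(lines)


# Placeholder values only (assumption): these are NOT measured scores.
placeholder_scores = {
    ("algo_a", "game_x"): [1.0, 2.0, 3.0],
    ("algo_b", "game_x"): [2.0, 4.0, 6.0],
}

summary = summarize(placeholder_scores)
print(format_table(summary))
```

Reporting the number of seeds next to the mean and standard deviation makes it easier to judge how much weight a difference between two algorithms deserves.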