Performance evaluation and comparison of algorithms

Open muupan opened this issue 8 years ago • 6 comments

It would be great to add performance evaluations and comparisons of the algorithms available in ChainerRL.

muupan avatar Jun 10 '17 08:06 muupan

Indeed. I like TRPO with exact second derivatives/Hessian-vector products :) It has nice theoretical properties.

ajaytalati avatar Jun 10 '17 09:06 ajaytalati

I agree TRPO is great, but supporting TRPO is off-topic on this issue.

https://github.com/openai/rllab and https://github.com/openai/baselines do this kind of evaluation and comparison really well, so they are a good starting point.

muupan avatar Jun 11 '17 01:06 muupan

There are also PyTorch implementations, which I'm currently using (once Chainer has second derivatives, it will be possible to port these over):

https://github.com/mjacar/pytorch-trpo

https://github.com/ikostrikov/pytorch-trpo

ChainerRL seems very promising as an alternative to the OpenAI repos mentioned above.

ajaytalati avatar Jun 11 '17 01:06 ajaytalati

Here are DQN's scores on five Atari games: https://github.com/muupan/chainerrl/blob/benchmark-dqn/evaluations/visualize.ipynb

muupan avatar Jul 31 '17 21:07 muupan

Added DoubleDQN and PAL.

muupan avatar Aug 02 '17 22:08 muupan

Added DQN with prioritized replay.

muupan avatar Aug 04 '17 21:08 muupan