Unified PCL

Open • ethancaballero opened this issue 8 years ago • 1 comment

See Section 5.1 of this paper for a new, more performant update to PCL (Unified PCL): https://arxiv.org/pdf/1702.08892.pdf

ethancaballero, Jun 12 '17 01:06

The unified version is already supported by the current implementation, at least in theory. The idea is to use a SharedModel that outputs the Q-values, plus two heads without trainable parameters that derive pi and V from those Q-values using the formula in the paper.
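To make the formula concrete, here is a minimal sketch of the two parameter-free heads. It is not the code from the linked example. It assumes the Unified PCL relations V(s) = tau * log sum_a exp(Q(s, a) / tau) and pi(a|s) = exp((Q(s, a) - V(s)) / tau), and it uses plain Python lists instead of chainer Variables. The function name `unified_pcl_heads` and the `tau` argument are hypothetical names chosen for illustration.

```python
import math


def unified_pcl_heads(q_values, tau=1.0):
    """Derive V and pi from Q-values without extra trainable parameters.

    Hypothetical helper for illustration only; not part of chainerrl.
    """
    # Subtract the max before exponentiating for numerical stability.
    q_max = max(q_values)
    # V(s) = tau * log sum_a exp(Q(s, a) / tau)
    log_sum = math.log(sum(math.exp((q - q_max) / tau) for q in q_values))
    v = q_max + tau * log_sum
    # pi(a | s) = exp((Q(s, a) - V(s)) / tau)
    pi = [math.exp((q - v) / tau) for q in q_values]
    return v, pi


# Example: three actions, temperature 0.5 (arbitrary illustrative values).
v, pi = unified_pcl_heads([1.0, 2.0, 0.5], tau=0.5)
```

Because V is the log-sum-exp of Q, the resulting pi is a softmax over Q / tau and sums to one by construction, so only the Q model carries trainable parameters.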

Here is a quick implementation: https://github.com/lyx-x/chainerrl/blob/ab6cb4f9ff1dd419573d8fa3fc8c05840548d74d/examples/gym/train_pcl_gym.py#L155

I believe we can close this issue.

lyx-x, Feb 14 '18 20:02