Yasuhiro Fujita

Results 90 comments of Yasuhiro Fujita

Wow, good catch! The sign does not seem correct. Thank you for reporting it. If I understand correctly, `tau` in the paper actually corresponds to `1-tau` in ChainerRL's IQN, because `|tau...
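To illustrate the correspondence, here is a minimal NumPy sketch (not ChainerRL's actual code) of the weight `|tau - 1{td_error < 0}|` used in the IQN/QR-DQN quantile Huber loss. Flipping the sign of the TD error gives the same weights as replacing `tau` with `1 - tau`:

```
import numpy as np

def quantile_huber_weight(tau, td_error):
    # |tau - 1{td_error < 0}|: the asymmetric weight in the quantile loss.
    return np.abs(tau - (td_error < 0).astype(np.float64))

tau = 0.3
e = np.array([1.0, -1.0])
w_pos = quantile_huber_weight(tau, e)       # weights under tau
w_neg = quantile_huber_weight(1 - tau, -e)  # sign flipped, tau -> 1 - tau
# w_pos and w_neg are identical, showing the tau <-> 1-tau symmetry.
```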

@uezo kindly implemented TicTacToe! http://qiita.com/uezo/items/87b25c93199d72a56a9a#%E5%8F%82%E8%80%83%E3%82%B5%E3%82%A4%E3%83%88

Thank you for the improvements on PCL. I haven't checked the implementation details yet, but I think solving the memory issue is great as long as it won't make training...

Good catch. The problem comes from the fact that resuming agent training via `step_offset` is not well tested.

@ElliotWay Interesting. Which game did you try? When I tuned `train_acer_ale.py`, I found it to be much more sample-efficient than A3C on Breakout with the default parameters.

@ElliotWay Thank you. It is possible there has been some regression in ChainerRL. It should be investigated.

~Links cannot be deepcopied after `to_device('native')`. We need to find a workaround or wait until it's fixed. https://github.com/chainer/chainer/issues/5916~ solved

Async training requires https://github.com/chainer/chainer/issues/5931 to be fixed.

```
import copy

import numpy as np


def deepcopy_link(link):
    # Move the link to NumPy first, since links on other devices
    # cannot be deepcopied directly (chainer/chainer#5916).
    device = link.device
    link.to_device(np)
    new_link = copy.deepcopy(link)
    # Restore both the original and the copy to the original device.
    link.to_device(device)
    new_link.to_device(device)
    return new_link
```

This can be a workaround for deepcopy.

Current ChainerX does not support advanced indexing, which prevents us from applying it to CategoricalDQN and IQN. https://github.com/chainer/chainer/issues/5944
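For context, here is a minimal NumPy sketch (illustrative only, not ChainerRL code) of the advanced-indexing pattern such agents typically rely on: selecting, for each batch row `i`, the Q-value of the action taken, `actions[i]`:

```
import numpy as np

# Batch of per-action values (2 states, 2 actions each) and chosen actions.
q_values = np.array([[1.0, 2.0],
                     [3.0, 4.0]])
actions = np.array([1, 0])

# Advanced indexing with two integer arrays: row i, column actions[i].
selected = q_values[np.arange(len(actions)), actions]
# selected == [2.0, 3.0]
```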