Yasuhiro Fujita
Wow, good catch! The sign seems incorrect. Thank you for reporting it. If I understand correctly, `tau` in the paper actually corresponds to `1-tau` in ChainerRL's IQN, because `|tau...
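For the record, the equivalence can be checked numerically. Below is a minimal NumPy sketch (function names here are illustrative, not ChainerRL's actual code), assuming the quantile Huber loss `|tau - 1{u < 0}| * L_kappa(u) / kappa` from the IQN paper:

```
import numpy as np

def huber(u, kappa=1.0):
    # Huber loss; symmetric in u, so L_kappa(-u) == L_kappa(u).
    return np.where(np.abs(u) <= kappa,
                    0.5 * u ** 2,
                    kappa * (np.abs(u) - 0.5 * kappa))

def quantile_huber(u, tau, kappa=1.0):
    # rho^kappa_tau(u) = |tau - 1{u < 0}| * L_kappa(u) / kappa
    return np.abs(tau - (u < 0)) * huber(u, kappa) / kappa

# Flipping the sign convention of the TD error u is equivalent to
# replacing tau with 1 - tau (the case u == 0 contributes zero loss).
u = np.linspace(-3.0, 3.0, 601)
tau = 0.3
assert np.allclose(quantile_huber(u, tau), quantile_huber(-u, 1.0 - tau))
```

So whether the code matches the paper depends on whether `u` is computed as target minus prediction or the reverse.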
@uezo kindly implemented TicTacToe! http://qiita.com/uezo/items/87b25c93199d72a56a9a#%E5%8F%82%E8%80%83%E3%82%B5%E3%82%A4%E3%83%88
Thank you for the improvements to PCL. I haven't checked the implementation details yet, but I think solving the memory issue is great as long as it won't make training...
Good catch. The problem comes from the fact that resuming agent training via `step_offset` is not well tested.
@ElliotWay Interesting. Which game did you try? When I tuned `train_acer_ale.py`, I found it to be much more sample-efficient than A3C on Breakout with the default parameters.
@ElliotWay Thank you. It is possible there has been some regression in ChainerRL. It should be investigated.
~~Links cannot be deepcopied after `to_device('native')`. We need to find a workaround or wait until it's fixed. https://github.com/chainer/chainer/issues/5916~~ solved
Async training requires https://github.com/chainer/chainer/issues/5931 to be fixed.
```
import copy

import numpy as np


def deepcopy_link(link):
    # Temporarily move the link to NumPy (CPU), where deepcopy works,
    # then restore both the original and the copy to the original device.
    device = link.device
    link.to_device(np)
    new_link = copy.deepcopy(link)
    link.to_device(device)
    new_link.to_device(device)
    return new_link
```
This can be a workaround for deepcopying links.
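For example (a hypothetical usage sketch, with `deepcopy_link` from above and a ChainerX-enabled build):

```
import chainer

link = chainer.links.Linear(4, 2)
link.to_device('native')  # after this, plain copy.deepcopy fails (chainer#5916)
copied = deepcopy_link(link)  # round-trips through NumPy, so it works
```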
ChainerX currently does not support advanced indexing, which prevents us from applying it to CategoricalDQN and IQN. https://github.com/chainer/chainer/issues/5944
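To illustrate the kind of indexing involved (shapes and variable names here are illustrative, not ChainerRL's exact code): both algorithms need to select, for each transition in a batch, the predicted return distribution of the action that was taken, which is naturally written with advanced indexing:

```
import numpy as np

batch_size, n_actions, n_atoms = 4, 6, 51
# Per-action return distributions: (batch_size, n_actions, n_atoms).
z = np.random.rand(batch_size, n_actions, n_atoms).astype(np.float32)
actions = np.array([2, 0, 5, 1])

# Advanced indexing with two integer arrays: pick the distribution of
# the taken action for each transition -> (batch_size, n_atoms).
z_taken = z[np.arange(batch_size), actions]
assert z_taken.shape == (batch_size, n_atoms)
```

This integer-array indexing pattern is what ChainerX cannot execute yet.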