chainermn
ChainerMN: Scalable distributed deep learning with Chainer
In the current code, if a user runs ``` $ pip install chainermn ```, it installs Chainer v4 and forces the uninstallation of any newer Chainer (such as v5 or v6). It is...
You might already know this: I recently tried ChainerMN on [Sakura Koukaryoku Computing](https://www.sakura.ad.jp/koukaryoku/). I measured processing throughput with the ImageNet example, comparing [ChainerMN's train_imagenet.py](https://github.com/chainer/chainermn/tree/master/examples/imagenet/train_imagenet.py) against [Chainer's train_imagenet_data_parallel.py](https://github.com/chainer/chainer/blob/master/examples/imagenet/train_imagenet_data_parallel.py). ``` //...
At least we know that the Communicator's `bcast_data` does not work with an FP16 model.

```diff
diff --git a/tests/chainermn_tests/communicator_tests/test_communicator.py b/tests/chainermn_tests/communicator_tests/test_communicator.py
index a0ff350..f03fa5d 100644
--- a/tests/chainermn_tests/communicator_tests/test_communicator.py
+++ b/tests/chainermn_tests/communicator_tests/test_communicator.py
@@ -242,6 +242,12 @@ def test_communicator_cpu(param):...
```
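The failure mode is easiest to reason about at the pack/unpack step that a `bcast_data`-style implementation typically performs before the actual MPI broadcast. The following is only an illustrative sketch (the helper names are hypothetical, not ChainerMN's actual code): it flattens parameter arrays into one contiguous buffer while preserving the model's dtype, which is exactly the property an FP16 test case would exercise.

```python
import numpy as np

def pack_params(params):
    """Flatten parameter arrays into one contiguous send buffer.

    Preserves the model's dtype. If a bcast implementation instead
    hard-codes float32 while the model is float16, the receiving side
    reinterprets the bytes incorrectly -- the kind of mismatch an FP16
    bcast_data test would catch.
    """
    dtype = params[0].dtype  # e.g. float16 for an FP16 model
    return np.concatenate([p.ravel() for p in params]).astype(dtype)

def unpack_params(buf, shapes):
    """Restore parameter arrays from the flat buffer after broadcast."""
    out, offset = [], 0
    for shape in shapes:
        n = int(np.prod(shape))
        out.append(buf[offset:offset + n].reshape(shape))
        offset += n
    return out

# Round-trip check with FP16 parameters
params = [np.random.randn(2, 3).astype(np.float16),
          np.random.randn(4).astype(np.float16)]
buf = pack_params(params)
restored = unpack_params(buf, [p.shape for p in params])
assert buf.dtype == np.float16
assert all(np.array_equal(a, b) for a, b in zip(params, restored))
```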
The current `scatter_dataset` creates sub-datasets of strictly equal lengths by duplicating some examples when necessary. This is needed for epoch triggers to work correctly. However, it is generally unnecessary for validator...
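A minimal sketch of the duplication-free alternative (the helper name is hypothetical, not ChainerMN's API): split a dataset into contiguous, non-overlapping chunks whose lengths differ by at most one, so no example is counted twice during evaluation.

```python
def scatter_without_duplication(dataset, comm_size):
    """Split `dataset` into `comm_size` contiguous, non-overlapping chunks.

    Chunk lengths differ by at most one, so no example is duplicated --
    suitable for validation, where exact epoch alignment across workers
    is unnecessary.
    """
    n = len(dataset)
    base, rem = divmod(n, comm_size)
    chunks, start = [], 0
    for rank in range(comm_size):
        size = base + (1 if rank < rem else 0)
        chunks.append(dataset[start:start + size])
        start += size
    return chunks

chunks = scatter_without_duplication(list(range(10)), 4)
# chunk lengths: [3, 3, 2, 2]; every example appears exactly once
```

The trade-off is that per-worker iterators finish at slightly different times, which is why the equal-length (duplicating) variant remains the right choice for training epochs.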
https://github.com/chainer/chainermn/blob/master/docs/source/reference/index.rst#communicators
ChainerMN has a mostly-copied BatchNormalization implementation (with several AllReduce calls added), which means potential bugs in Chainer could also be inherited. https://github.com/chainer/chainer/pull/4191 could be one of them; porting the fix to ChainerMN seems...
This issue is not specific to chainermn, so I was unsure where to submit it. In the [training example of ImageNet](https://github.com/chainer/chainermn/blob/master/examples/imagenet/train_imagenet.py), I cannot run the test without removing the `multiprocessing.set_start_method('forkserver')`...
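For context, a sketch of a more defensive version of that guard (this is an assumption about a possible mitigation, not the example's actual code): `set_start_method` raises `RuntimeError` if the start method was already fixed, e.g. by a test runner that imported the module and started workers first, which is one plausible reason tests fail until the call is removed.

```python
import multiprocessing

# Sketch: with CUDA, a fork()ed worker inherits the parent's CUDA context
# in a broken state, so the ImageNet example switches to 'forkserver'
# before MultiprocessIterator spawns its workers. Guarding the call keeps
# it from crashing when the start method is already fixed.
try:
    multiprocessing.set_start_method('forkserver')
except (RuntimeError, ValueError):
    # RuntimeError: the start method was already set (e.g. by a test
    # runner that imported this module earlier).
    # ValueError: 'forkserver' is unavailable on this platform (Windows).
    pass

print(multiprocessing.get_start_method())
```

An alternative that avoids mutating global state entirely is `multiprocessing.get_context('forkserver')`, which returns a private context object to pass to the iterator-creating code.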
This issue is the central place to discuss future plans. Any suggestions and contributions are appreciated. We only discuss relatively large tasks here; smaller tasks are managed in...