pytorch-segmentation-toolbox
The program is stuck.
(pytorch-0.41) <phd-1@kbkb541-server pytorch-segmentation-toolbox>$CUDA_VISIBLE_DEVICES=0,1,2,3 sh ./run_local.sh /media/phd-1/syz/OCNet/dataset/cityscapes Linux kbkb541-server 4.15.0-39-generic #42~16.04.1-Ubuntu SMP Wed Oct 24 17:09:54 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux 2018年 12月 04日 星期二 17:25:46 CST ResNet( (conv1): Conv2d(3, 64, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False) (bn1): InPlaceABNSync(64, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (relu1): ReLU() (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn2): InPlaceABNSync(64, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (relu2): ReLU() (conv3): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn3): InPlaceABNSync(128, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (relu3): ReLU() (maxpool): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=True) (relu): ReLU() (layer1): Sequential( (0): Bottleneck( (conv1): Conv2d(128, 64, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): InPlaceABNSync(64, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn2): InPlaceABNSync(64, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (relu): ReLU() (relu_inplace): ReLU(inplace) (downsample): Sequential( (0): Conv2d(128, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (1): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) ) ) (1): Bottleneck( (conv1): Conv2d(256, 64, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): InPlaceABNSync(64, eps=1e-05, 
momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn2): InPlaceABNSync(64, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (relu): ReLU() (relu_inplace): ReLU(inplace) ) (2): Bottleneck( (conv1): Conv2d(256, 64, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): InPlaceABNSync(64, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn2): InPlaceABNSync(64, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (relu): ReLU() (relu_inplace): ReLU(inplace) ) ) (layer2): Sequential( (0): Bottleneck( (conv1): Conv2d(256, 128, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): InPlaceABNSync(128, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False) (bn2): InPlaceABNSync(128, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (conv3): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): InPlaceABNSync(512, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (relu): ReLU() (relu_inplace): ReLU(inplace) (downsample): Sequential( (0): Conv2d(256, 512, kernel_size=(1, 1), stride=(2, 2), bias=False) (1): InPlaceABNSync(512, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) ) ) (1): Bottleneck( (conv1): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1), 
bias=False) (bn1): InPlaceABNSync(128, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn2): InPlaceABNSync(128, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (conv3): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): InPlaceABNSync(512, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (relu): ReLU() (relu_inplace): ReLU(inplace) ) (2): Bottleneck( (conv1): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): InPlaceABNSync(128, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn2): InPlaceABNSync(128, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (conv3): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): InPlaceABNSync(512, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (relu): ReLU() (relu_inplace): ReLU(inplace) ) (3): Bottleneck( (conv1): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): InPlaceABNSync(128, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (bn2): InPlaceABNSync(128, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (conv3): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): InPlaceABNSync(512, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (relu): ReLU() (relu_inplace): ReLU(inplace) ) ) (layer3): Sequential( (0): Bottleneck( (conv1): Conv2d(512, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (conv2): Conv2d(256, 256, 
kernel_size=(3, 3), stride=(1, 1), padding=(2, 2), dilation=(2, 2), bias=False) (bn2): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): InPlaceABNSync(1024, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (relu): ReLU() (relu_inplace): ReLU(inplace) (downsample): Sequential( (0): Conv2d(512, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False) (1): InPlaceABNSync(1024, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) ) ) (1): Bottleneck( (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2), dilation=(2, 2), bias=False) (bn2): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): InPlaceABNSync(1024, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (relu): ReLU() (relu_inplace): ReLU(inplace) ) (2): Bottleneck( (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2), dilation=(2, 2), bias=False) (bn2): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): InPlaceABNSync(1024, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (relu): ReLU() (relu_inplace): ReLU(inplace) ) (3): Bottleneck( (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): InPlaceABNSync(256, 
eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2), dilation=(2, 2), bias=False) (bn2): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): InPlaceABNSync(1024, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (relu): ReLU() (relu_inplace): ReLU(inplace) ) (4): Bottleneck( (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2), dilation=(2, 2), bias=False) (bn2): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): InPlaceABNSync(1024, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (relu): ReLU() (relu_inplace): ReLU(inplace) ) (5): Bottleneck( (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2), dilation=(2, 2), bias=False) (bn2): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): InPlaceABNSync(1024, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (relu): ReLU() (relu_inplace): ReLU(inplace) ) (6): Bottleneck( (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (conv2): Conv2d(256, 256, 
kernel_size=(3, 3), stride=(1, 1), padding=(2, 2), dilation=(2, 2), bias=False) (bn2): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): InPlaceABNSync(1024, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (relu): ReLU() (relu_inplace): ReLU(inplace) ) (7): Bottleneck( (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2), dilation=(2, 2), bias=False) (bn2): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): InPlaceABNSync(1024, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (relu): ReLU() (relu_inplace): ReLU(inplace) ) (8): Bottleneck( (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2), dilation=(2, 2), bias=False) (bn2): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): InPlaceABNSync(1024, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (relu): ReLU() (relu_inplace): ReLU(inplace) ) (9): Bottleneck( (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2), dilation=(2, 2), bias=False) (bn2): InPlaceABNSync(256, 
eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): InPlaceABNSync(1024, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (relu): ReLU() (relu_inplace): ReLU(inplace) ) (10): Bottleneck( (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2), dilation=(2, 2), bias=False) (bn2): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): InPlaceABNSync(1024, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (relu): ReLU() (relu_inplace): ReLU(inplace) ) (11): Bottleneck( (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2), dilation=(2, 2), bias=False) (bn2): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): InPlaceABNSync(1024, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (relu): ReLU() (relu_inplace): ReLU(inplace) ) (12): Bottleneck( (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2), dilation=(2, 2), bias=False) (bn2): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (conv3): Conv2d(256, 1024, 
kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): InPlaceABNSync(1024, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (relu): ReLU() (relu_inplace): ReLU(inplace) ) (13): Bottleneck( (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2), dilation=(2, 2), bias=False) (bn2): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): InPlaceABNSync(1024, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (relu): ReLU() (relu_inplace): ReLU(inplace) ) (14): Bottleneck( (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2), dilation=(2, 2), bias=False) (bn2): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): InPlaceABNSync(1024, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (relu): ReLU() (relu_inplace): ReLU(inplace) ) (15): Bottleneck( (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2), dilation=(2, 2), bias=False) (bn2): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): InPlaceABNSync(1024, eps=1e-05, momentum=0.1, 
affine=True, devices=[0, 1, 2, 3], activation=none) (relu): ReLU() (relu_inplace): ReLU(inplace) ) (16): Bottleneck( (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2), dilation=(2, 2), bias=False) (bn2): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): InPlaceABNSync(1024, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (relu): ReLU() (relu_inplace): ReLU(inplace) ) (17): Bottleneck( (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2), dilation=(2, 2), bias=False) (bn2): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): InPlaceABNSync(1024, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (relu): ReLU() (relu_inplace): ReLU(inplace) ) (18): Bottleneck( (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2), dilation=(2, 2), bias=False) (bn2): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): InPlaceABNSync(1024, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (relu): ReLU() (relu_inplace): ReLU(inplace) ) 
(19): Bottleneck( (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2), dilation=(2, 2), bias=False) (bn2): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): InPlaceABNSync(1024, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (relu): ReLU() (relu_inplace): ReLU(inplace) ) (20): Bottleneck( (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2), dilation=(2, 2), bias=False) (bn2): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): InPlaceABNSync(1024, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (relu): ReLU() (relu_inplace): ReLU(inplace) ) (21): Bottleneck( (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2), dilation=(2, 2), bias=False) (bn2): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): InPlaceABNSync(1024, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (relu): ReLU() (relu_inplace): ReLU(inplace) ) (22): Bottleneck( (conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): 
InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(2, 2), dilation=(2, 2), bias=False) (bn2): InPlaceABNSync(256, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): InPlaceABNSync(1024, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (relu): ReLU() (relu_inplace): ReLU(inplace) ) ) (layer4): Sequential( (0): Bottleneck( (conv1): Conv2d(1024, 512, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): InPlaceABNSync(512, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(4, 4), dilation=(4, 4), bias=False) (bn2): InPlaceABNSync(512, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (conv3): Conv2d(512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): InPlaceABNSync(2048, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (relu): ReLU() (relu_inplace): ReLU(inplace) (downsample): Sequential( (0): Conv2d(1024, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False) (1): InPlaceABNSync(2048, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) ) ) (1): Bottleneck( (conv1): Conv2d(2048, 512, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): InPlaceABNSync(512, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(4, 4), dilation=(4, 4), bias=False) (bn2): InPlaceABNSync(512, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (conv3): Conv2d(512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): InPlaceABNSync(2048, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (relu): ReLU() 
(relu_inplace): ReLU(inplace) ) (2): Bottleneck( (conv1): Conv2d(2048, 512, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn1): InPlaceABNSync(512, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(4, 4), dilation=(4, 4), bias=False) (bn2): InPlaceABNSync(512, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (conv3): Conv2d(512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False) (bn3): InPlaceABNSync(2048, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=none) (relu): ReLU() (relu_inplace): ReLU(inplace) ) ) (head): Sequential( (0): PSPModule( (stages): ModuleList( (0): Sequential( (0): AdaptiveAvgPool2d(output_size=(1, 1)) (1): Conv2d(2048, 512, kernel_size=(1, 1), stride=(1, 1), bias=False) (2): InPlaceABNSync(512, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=leaky_relu slope=0.01) ) (1): Sequential( (0): AdaptiveAvgPool2d(output_size=(2, 2)) (1): Conv2d(2048, 512, kernel_size=(1, 1), stride=(1, 1), bias=False) (2): InPlaceABNSync(512, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=leaky_relu slope=0.01) ) (2): Sequential( (0): AdaptiveAvgPool2d(output_size=(3, 3)) (1): Conv2d(2048, 512, kernel_size=(1, 1), stride=(1, 1), bias=False) (2): InPlaceABNSync(512, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=leaky_relu slope=0.01) ) (3): Sequential( (0): AdaptiveAvgPool2d(output_size=(6, 6)) (1): Conv2d(2048, 512, kernel_size=(1, 1), stride=(1, 1), bias=False) (2): InPlaceABNSync(512, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=leaky_relu slope=0.01) ) ) (bottleneck): Sequential( (0): Conv2d(4096, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False) (1): InPlaceABNSync(512, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=leaky_relu slope=0.01) (2): Dropout2d(p=0.1) ) ) (1): 
Conv2d(512, 19, kernel_size=(1, 1), stride=(1, 1)) ) (dsn): Sequential( (0): Conv2d(1024, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)) (1): InPlaceABNSync(512, eps=1e-05, momentum=0.1, affine=True, devices=[0, 1, 2, 3], activation=leaky_relu slope=0.01) (2): Dropout2d(p=0.1) (3): Conv2d(512, 19, kernel_size=(1, 1), stride=(1, 1)) ) ) /home/phd-1/.conda/envs/pytorch-0.41/lib/python3.6/site-packages/torch/nn/functional.py:52: UserWarning: size_average and reduce args will be deprecated, please use reduction='elementwise_mean' instead. warnings.warn(warning.format(ret)) 321300 images are loaded!
It doesn't continue. Why? I think it may be because of InPlaceABNSync. How can I solve it?
Hi @suyanzhou626, I cannot find the problem from this information. Please first make sure your data loader can actually access the images and labels.
@suyanzhou626 Hi, I ran into the same problem. Have you solved it?
Same problem here, but my program got stuck after printing iteration 1, with no error output. I am running on 4x 12 GB GPUs (three TITAN Xp, one TITAN V) with batch size 4x2=8.
If I reduce the batch size to 4x1=4 with the default crop size of 769, or keep the default batch size of 4x2=8 with a smaller crop size of 761, the program runs fine.
So it looks like a memory problem, but there is no out-of-memory error; the program just hangs. @speedinghzl Any thoughts? Thanks.
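One way to tell a genuine hang from silent memory exhaustion is to watch per-GPU memory while the job sits there. A minimal sketch (the helper name is mine; it assumes the NVIDIA driver's `nvidia-smi` tool is on the PATH):

```python
import shutil
import subprocess

def gpu_memory_report():
    """Return per-GPU memory usage as CSV text, or a note if unavailable."""
    if shutil.which("nvidia-smi") is None:
        return "nvidia-smi not found"
    result = subprocess.run(
        ["nvidia-smi",
         "--query-gpu=index,name,memory.used,memory.total",
         "--format=csv,noheader"],
        capture_output=True, text=True, check=True)
    return result.stdout

print(gpu_memory_report())
```

If one card reports usage right at its ceiling while the others are idle, the process is most likely stalled on an allocation rather than dead.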
@lzrobots Yes, it is caused by running out of memory. Is the TITAN V in the first position on your server? If so, you can change the order of the GPU IDs (e.g. 1,0,2,3) to solve this problem. Or you can run this repo with a crop size of 761; it does not affect the final result.
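If you prefer to do the reordering from Python rather than in the shell, the environment variable has to be set before PyTorch initializes CUDA. A sketch, using the GPU order suggested above:

```python
import os

# Make the card at physical index 1 appear as cuda:0, and push the
# TITAN V (physical index 0) into the second slot. This must run before
# `import torch` or any other CUDA initialization.
os.environ["CUDA_VISIBLE_DEVICES"] = "1,0,2,3"

print(os.environ["CUDA_VISIBLE_DEVICES"])
```

This works because `nn.DataParallel` places gathered outputs and extra bookkeeping on the first visible device, so that slot should go to a card with headroom.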
Yes solved. Thanks!
I am facing the same problem as @lzrobots. I tried BS=8 with INPUT_SIZE=[769, 769] or [761, 761], and training is stuck after iteration 1. I have 4x 12 GB 1080 Ti GPUs. With a smaller BS, say BS=4, the program runs well, but I'm afraid it may affect the final performance. Any suggestions here, @speedinghzl?
[EDIT] It is stuck even with a lower input size, [713, 713]. Only lowering BS seems to help. Is there any workaround?
The 1080 Ti only has 11 GB of memory, so you can try lowering the batch size. But I think it will affect the performance (~77% rather than ~78%).
Hi @speedinghzl, thanks for your swift response. Yes, the available memory is only around 11 GB. I think I can live with that much performance difference, so I will proceed with a lower BS.
Thanks for your help!
Hi @speedinghzl ,
Just for your information: changing to BS=4 while keeping everything else as it was, I got an mIoU of ~75.8%.
When you set BS=4, you should increase the number of iterations from 40K to 80K. Then you can also increase the input size to make use of the ~11 GB of memory.
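The reasoning behind doubling the schedule: halving the batch size halves the samples seen per step, so doubling the steps keeps the total training-sample budget unchanged. A quick check with the numbers from this thread:

```python
def total_samples(steps, batch_size):
    """Total training samples seen over the whole schedule."""
    return steps * batch_size

# Default: 40K steps at batch 8; reduced-memory run: 80K steps at batch 4.
assert total_samples(40_000, 8) == total_samples(80_000, 4) == 320_000
print(total_samples(80_000, 4))
```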
Hi @speedinghzl ,
Your suggestions make sense. I'll try them out now and update you with the outcome. Thanks for your help!
@speedinghzl Sorry, I have the same issue; my program even gets stuck on 4x TITAN Xp with the default settings.
Hi @speedinghzl,
I get a score of 76.22% after changing STEPS from 40K to 80K (keeping BS=4 and everything else as it is).
I did not try a larger input size, but that also seems like an option worth trying, since there is still some memory left that could be used. Thanks for your help!
Hi @d-li14, could you try running with BS=4 and STEPS=80K and see if that solves your problem? The performance numbers reported above are there for your reference.
@aasharma90 Thanks for your kind advice! Shrinking the batch size can definitely fit the model into GPU memory with ease, but we have to sacrifice performance, as demonstrated in your experiments (it is even unable to reproduce the original DeepLab result, significantly below the reported 78.9%).
I am curious: as stated by the author, 4x 12 GB of VRAM should be enough to run the script successfully, but in my case it does not seem to work. Any helpful advice? @speedinghzl
Actually, 4x 12 GB of VRAM is not enough. I have run run_local.sh
with 4 Tesla M40 GPUs, and the memory usage on GPU 0 exceeded 12,800 MB. Without modifying any of the default settings of this script, I got a final mIoU of 77.4%.