Zijun Deng

Results 18 comments of Zijun Deng

Actually, I did not use the pretrained model provided by jcjohnson. Instead, I used his code to convert the Caffe VGG model. I am not sure whether it matters.

Yes, the VOC performance of FCN has been reproduced, and you can see the result in *voc-fcn (caffe vgg)*. The [official FCN code](https://github.com/shelhamer/fcn.berkeleyvision.org) does not provide training settings for Cityscapes...

My implementation of SegNet does not use max pooling indices.

The label batch should be in shape (16, 512, 256) and the output batch should be in shape (16, 19, 512, 256). Check whether you loaded the wrong label image.
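
A minimal sketch of that shape check, assuming a 19-class setup and the built-in `F.cross_entropy` as the per-pixel loss (the repository may use its own loss wrapper). The helper name `check_segmentation_batch` is hypothetical, and the tensors are small stand-ins for the (16, 19, 512, 256) and (16, 512, 256) batches mentioned above.

```python
import torch
import torch.nn.functional as F


def check_segmentation_batch(output, label, num_classes):
    """Raise ValueError if output/label do not fit a per-pixel classification loss."""
    # Output must be (N, C, H, W) with C equal to the number of classes.
    if output.dim() != 4 or output.size(1) != num_classes:
        raise ValueError(f"expected output (N, {num_classes}, H, W), got {tuple(output.shape)}")
    # Label must be (N, H, W): same batch and spatial size, no channel axis.
    expected = (output.size(0), output.size(2), output.size(3))
    if tuple(label.shape) != expected:
        raise ValueError(f"expected label {expected}, got {tuple(label.shape)}")
    # Class indices must be integer typed.
    if label.dtype != torch.long:
        raise ValueError(f"expected label dtype torch.long, got {label.dtype}")


# Small stand-in tensors; a real batch would be (16, 19, 512, 256) and (16, 512, 256).
output = torch.randn(2, 19, 8, 4)
label = torch.randint(0, 19, (2, 8, 4))

check_segmentation_batch(output, label, num_classes=19)
loss = F.cross_entropy(output, label)
```

A label that still carries a channel axis, for example (N, 1, H, W), fails the second check, which usually points at the label loading or transform step.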

For this [issue](https://github.com/ZijunDeng/pytorch-semantic-segmentation/issues/4#issuecomment-327293745) you should check whether *torchvision.transforms.ToTensor* is used for the *label* image.
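
The reason this matters: *ToTensor* adds a channel axis and rescales 8-bit values to [0, 1], so class ids stop being integers. The sketch below imitates that scaling by hand on a hypothetical 2x3 label map (it does not call torchvision) and contrasts it with a plain integer conversion; the value 255 as an "ignore" id is an assumption.

```python
import numpy as np
import torch

# Hypothetical label map: class ids 0, 1, 18 and an assumed "ignore" value of 255.
label_np = np.array([[0, 1, 18], [18, 255, 1]], dtype=np.uint8)

# Imitation of what ToTensor does to a uint8 H x W array:
# add a channel axis, convert to float, divide by 255.
scaled = torch.from_numpy(label_np).unsqueeze(0).float().div(255)

# Conversion that keeps class ids intact: integer tensor of shape (H, W).
label = torch.from_numpy(label_np.astype(np.int64))

print(scaled.shape, scaled.dtype)  # float tensor with an extra leading axis
print(label.shape, label.dtype)    # integer tensor, ids unchanged
```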

It seems that the two feature maps mismatch in the "height" dimension. You can use *torch.nn.functional.upsample* to make sure that the two feature maps have the same spatial size....
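
A rough sketch of that resizing step. It uses `F.interpolate`, the name newer PyTorch releases give to `upsample`. The helper `match_spatial_size`, the tensor names, the bilinear mode, and the shapes (heights 31 vs 32) are all illustrative assumptions, not taken from the repository.

```python
import torch
import torch.nn.functional as F


def match_spatial_size(x, reference):
    """Resize x so its (H, W) equals that of reference; channels are untouched."""
    target = tuple(reference.shape[2:])
    if tuple(x.shape[2:]) == target:
        return x
    return F.interpolate(x, size=target, mode="bilinear", align_corners=False)


# Two feature maps that differ by one pixel in height, as can happen with odd input sizes.
decoder_feat = torch.randn(1, 64, 31, 64)
skip_feat = torch.randn(1, 64, 32, 64)

# After resizing, the maps can be concatenated along the channel axis.
fused = torch.cat([match_spatial_size(decoder_feat, skip_feat), skip_feat], dim=1)
```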

It takes about 4 hours (much slower than before!!!) to train an epoch (batch size=2 on a 1080Ti). The code on GitHub is not the latest and I will update it later.

At first I wanted to implement sliced prediction by just adding a decorator to the forward function of the network. But during training my GPU cannot hold all the slices of...

Yes, that's a problem. The evaluation of mIoU has to be done at once on the whole val set. My computer has 32GB memory, which is enough.
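
If memory is tighter, one possible way to still score the whole val set in one go is to accumulate a confusion matrix batch by batch and compute mIoU from it only at the end. This is a sketch of that alternative, not what the repository does; the function names and the `ignore_index=255` convention are assumptions.

```python
import numpy as np


def update_confusion(conf, pred, label, num_classes, ignore_index=255):
    """Add one batch to the running confusion matrix (rows: label, cols: prediction)."""
    mask = label != ignore_index
    idx = num_classes * label[mask].astype(np.int64) + pred[mask].astype(np.int64)
    conf += np.bincount(idx, minlength=num_classes ** 2).reshape(num_classes, num_classes)
    return conf


def mean_iou(conf):
    """mIoU over classes that appear in either labels or predictions."""
    tp = np.diag(conf)
    union = conf.sum(axis=0) + conf.sum(axis=1) - tp
    valid = union > 0
    return float((tp[valid] / union[valid]).mean())


# Toy 2-class example split into two "batches".
num_classes = 2
conf = np.zeros((num_classes, num_classes), dtype=np.int64)
batches = [
    (np.array([0, 1]), np.array([0, 0])),  # (pred, label)
    (np.array([1, 1]), np.array([1, 1])),
]
for pred, label in batches:
    update_confusion(conf, pred, label, num_classes)

miou = mean_iou(conf)
```

Only the `num_classes x num_classes` matrix stays in memory, and the final score is still computed once over all pixels rather than averaged per batch.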

Turn the mask into a 2-channel one (one channel for the foreground class and another channel for the background class). Or just use sigmoid + BCELoss2D.
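
Both options can be sketched as follows. The channel order (0 = background, 1 = foreground) is an assumption, and `F.binary_cross_entropy` stands in for the BCELoss2D mentioned above; the tiny 2x2 mask and random logits are placeholders.

```python
import torch
import torch.nn.functional as F

# Binary mask of shape (N, H, W), where 1 marks foreground.
mask = torch.tensor([[[0, 1], [1, 1]]])

# Option 1: two-channel target of shape (N, 2, H, W).
# Channel 0 is background, channel 1 is foreground (assumed order).
two_channel = torch.stack([1 - mask, mask], dim=1).float()

# Option 2: keep a single output channel and apply sigmoid + binary cross-entropy.
logits = torch.randn(1, 1, 2, 2)           # stand-in for the network output
probs = torch.sigmoid(logits)
target = mask.unsqueeze(1).float()         # (N, 1, H, W) to match the output
loss = F.binary_cross_entropy(probs, target)
```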