
Resize image using transforms in image segmentation examples

Open alvarobartt opened this issue 3 years ago • 2 comments

This is a proposal rather than an issue. The Image Segmentation models in the Hub, in this case Deeplabv3-ResNet101 and FCN-ResNet101, have a restriction on the input image dimensions, as mentioned in the docs: "H and W are expected to be at least 224px".

So, to ease the "copy-paste" of the example that the average user will do, I think it's a good idea to include the following transforms.Compose() instead of the current one, noting that the extra lines are optional when the image height and width are already above 224px, since the image segmentation will work almost the same way as in the current example.

The following piece of code:

preprocess = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

would be replaced by:

preprocess = transforms.Compose([
    transforms.Resize(256), # Optional: Resize the input PIL Image to the given size.
    transforms.CenterCrop(224), # Optional: Crops the given PIL Image at the center.
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

As specified, for example, in the PyTorch Transfer Learning beginner tutorial.
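To see concretely what the two added transforms do to the image dimensions, here is a small dependency-free sketch. The helper functions are illustrative, not part of torchvision: Resize(256) scales the image so its shorter side is 256 px while preserving aspect ratio, and CenterCrop(224) then takes the central 224x224 region.

```python
def resize_shorter_side(w, h, size=256):
    # Mimics transforms.Resize(256): scale so the shorter side
    # equals `size`, preserving the aspect ratio.
    if w <= h:
        return size, int(size * h / w)
    return int(size * w / h), size

def center_crop(w, h, size=224):
    # Mimics transforms.CenterCrop(224): the output is size x size,
    # assuming the input is at least that large after resizing.
    return min(w, size), min(h, size)

w, h = resize_shorter_side(1200, 800)  # shorter side (800) -> 256
print((w, h))              # (384, 256)
print(center_crop(w, h))   # (224, 224)
```

This shows why the resulting tensor always satisfies the "at least 224px" requirement regardless of the original image size.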

Note: I don't know if it's in the scope of the example, but it may be worth mentioning that reducing the image size makes inference faster, while the segmentation will be coarser since there are fewer pixels.

alvarobartt · Oct 07 '20 07:10

BTW, just let me know if you think this is useful enough to be included in the examples, and I'll modify the code in the Markdown files myself (and maybe add a few more lines to the explanation).

alvarobartt · Oct 07 '20 07:10

Another thing to consider: use the URL of the image presented in the Deeplabv3-ResNet101 and FCN-ResNet101 posts, which is images/deeplab1.png, rather than the images/dog.jpg used in the code of those posts.

Then the following piece of code:

# Download an example image from the pytorch website
import urllib.request
url, filename = ("https://github.com/pytorch/hub/raw/master/images/dog.jpg", "dog.jpg")
urllib.request.urlretrieve(url, filename)

should be modified to:

# Download an example image from the pytorch/hub repository
import urllib.request
url, filename = ("https://github.com/pytorch/hub/raw/master/images/deeplab1.png", "deeplab1.png")
urllib.request.urlretrieve(url, filename)
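When swapping the example image, the url and filename values must be changed together. A small helper (hypothetical, not part of the Hub examples) could derive the filename from the URL's last path component so the two can never drift apart:

```python
from urllib.parse import urlparse
from pathlib import PurePosixPath

def filename_from_url(url):
    # Take the last component of the URL path as the local filename,
    # e.g. ".../images/deeplab1.png" -> "deeplab1.png".
    return PurePosixPath(urlparse(url).path).name

url = "https://github.com/pytorch/hub/raw/master/images/deeplab1.png"
print(filename_from_url(url))  # deeplab1.png
```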

Note: everything above applies to both Image Segmentation models, Deeplabv3-ResNet101 and FCN-ResNet101, as both entries use the "wrong" image. (?)

alvarobartt · Oct 07 '20 11:10