fast-neural-style

What does style_image_size do?

Open ghost opened this issue 9 years ago • 4 comments

What exactly does the style_image_size parameter do? The default is 256. Does it mean the image is first scaled down to 256x256 and then fed to the VGG network to compute the Gram matrix?

ghost avatar Nov 05 '16 00:11 ghost

Yes, the style image is scaled, though not to 256x256: the aspect ratio is kept, and the image is resized so that its smaller side is 256 pixels. This, among many other things, is explained in https://github.com/jcjohnson/fast-neural-style/blob/master/doc/flags.md
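
To make the aspect-preserving resize concrete, here is a minimal sketch of the size arithmetic only. `scaled_size` and its `pin` argument are hypothetical names for illustration and are not part of fast-neural-style; the actual resize in the project is done by its own training code.

```python
def scaled_size(width, height, target, pin="smaller"):
    """Return (new_width, new_height) after an aspect-preserving resize.

    Hypothetical helper, not part of fast-neural-style.
    pin="smaller" makes the shorter side equal to target;
    pin="larger" makes the longer side equal to target.
    """
    if pin == "smaller":
        ref = min(width, height)
    elif pin == "larger":
        ref = max(width, height)
    else:
        raise ValueError("pin must be 'smaller' or 'larger'")
    # Multiply before dividing so the pinned side comes out exactly as target.
    return round(width * target / ref), round(height * target / ref)


print(scaled_size(1024, 768, 256))            # (341, 256)
print(scaled_size(1024, 768, 256, "larger"))  # (256, 192)
```

Either way the result is generally not square: a 1024x768 style image never becomes 256x256.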

htoyryla avatar Nov 05 '16 06:11 htoyryla

@deeprun the VGG module is not the whole VGG16: the FC layers have been removed, so you can feed an image of any resolution into the network.
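
A rough sketch of why dropping the FC layers removes the size restriction. It assumes the standard VGG16 layout (3x3 convolutions with padding 1 that keep the spatial size, five 2x2 max-pools with stride 2, 512 channels at the end) and floor-mode pooling; Caffe-derived models may round up instead. `conv_stack_output_size` is a hypothetical helper, and the loss network in this project may stop at an earlier layer than the last pool.

```python
def conv_stack_output_size(height, width, num_pools=5):
    """Spatial size after the VGG16 conv/pool stack (floor-mode pooling assumed)."""
    for _ in range(num_pools):
        # 3x3 convs with padding 1 keep the size; each 2x2 pool halves it.
        height //= 2
        width //= 2
    return height, width


# The first FC layer of VGG16 expects exactly 512 * 7 * 7 inputs.
FC6_INPUTS = 512 * 7 * 7  # 25088

for h, w in [(224, 224), (256, 341)]:
    out_h, out_w = conv_stack_output_size(h, w)
    flat = 512 * out_h * out_w
    print((h, w), "->", (out_h, out_w), "fc6 compatible:", flat == FC6_INPUTS)
```

The convolutional part produces a valid feature map for both inputs; only the flattened size fed to the first FC layer is tied to a 224x224 input.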

austingg avatar Nov 11 '16 09:11 austingg

@austingg, I don't think the question was related to the FC layers only working with 256x256 but to the parameter style_image_size, which does indeed scale the longer side of the style image to whatever the parameter specifies. The content image can still be any size as long as there is enough memory.
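
As an illustration of why the style and content images need not match in size: the style targets are Gram matrices, which are C x C for a layer with C channels no matter how many spatial positions the feature map has. The sketch below uses plain Python lists and assumes normalization by C*H*W; `gram_matrix` is a hypothetical helper, not the project's code.

```python
def gram_matrix(features):
    """Gram matrix of a feature map.

    features: list of C channels, each a flat list of H*W activations.
    Returns a C x C nested list, normalized by C*H*W (an assumption here).
    """
    c = len(features)
    n = len(features[0])
    return [
        [sum(a * b for a, b in zip(features[i], features[j])) / (c * n)
         for j in range(c)]
        for i in range(c)
    ]


small = [[1.0, 2.0], [3.0, 4.0]]   # C=2, H*W=2
large = [[1.0] * 6, [2.0] * 6]     # C=2, H*W=6

g_small = gram_matrix(small)
g_large = gram_matrix(large)
print(g_small)                                   # [[1.25, 2.75], [2.75, 6.25]]
print(len(g_large), len(g_large[0]))             # 2 2
```

Both results are 2x2, so a Gram matrix computed from a style image at one resolution can be compared directly with one computed from an output image at another.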

In the original neural-style, there was a style scale parameter which worked in the same way: it scaled the style image during pre-processing.

By the way, I modified the original neural-style so that one could also use the FC layers in the optimization regardless of the image size. I have written about it in three blog posts starting from http://liipetti.net/erratic/2016/03/28/controlling-image-content-with-fc-layers/

htoyryla avatar Nov 11 '16 09:11 htoyryla

A question about the super-resolution task: when the perceptual loss is computed, is the input image resized to the size the original VGG network expects (even though only the fully convolutional part of VGG is used), or can the input have an arbitrary size, so that no resize is needed?

mrgloom avatar Jul 29 '19 11:07 mrgloom