SinGAN

What is the meaning of the Effective Patch Size in Figure 4 in your paper?

Open Aluooooo opened this issue 5 years ago • 8 comments

What is the meaning of the Effective Patch Size in Figure 4 in your paper?

Aluooooo avatar Dec 02 '19 02:12 Aluooooo

In my view, as the authors state in the paper, the longer side of the coarsest image is set to 25px, and the discriminator's receptive field is 11px*11px at every scale. The receptive field is therefore largest, relative to the image, at the coarsest scale. As the image becomes finer, the effective patch size (the receptive field relative to the image size) becomes smaller.
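
To make the ratio concrete, here is a minimal sketch, not taken from the SinGAN code. The 11px receptive field and the 25px coarsest side come from the explanation above, and 250px is the finest size mentioned later in this thread. The intermediate sizes (50, 100) and the function name `effective_patch_fraction` are only illustrative assumptions.

```python
# Fixed receptive field of the patch discriminator, in pixels (from the thread).
RECEPTIVE_FIELD = 11


def effective_patch_fraction(longer_side_px, rf=RECEPTIVE_FIELD):
    """Fraction of the image's longer side covered by one receptive field."""
    return min(1.0, rf / longer_side_px)


# 25 is the coarsest size quoted above; 50 and 100 are made-up
# intermediate sizes; 250 is the finest size mentioned in this thread.
for side in (25, 50, 100, 250):
    print(f"longer side {side:>3}px -> covers {effective_patch_fraction(side):.1%}")
```

The same 11px window covers a large share of the 25px image and only a few percent of the 250px one, which is the shrinking effective patch size described above.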

xrenaa avatar Dec 02 '19 04:12 xrenaa

Thank you very much!

Aluooooo avatar Dec 04 '19 05:12 Aluooooo

@xrenaa Hello, I have another question! So a "patch" is not actually a crop of the original image (e.g. X_n in the paper), i.e. a part of the image? Should I think of a patch as the 11x11 convolution kernel (which results from combining 5 convs)?

Thanks!

dianshan14 avatar Dec 08 '19 15:12 dianshan14

@dianshan14

Yes. The patch is actually what a patch discriminator can see.
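
For reference, a small sketch of where 11x11 can come from. It assumes the 5 convs each use a 3x3 kernel with stride 1; the thread only states "5 convs", so the kernel size and stride are assumptions here. The helper `receptive_field` is hypothetical and is not part of the SinGAN code.

```python
def receptive_field(kernel_sizes, strides=None):
    """Receptive field (in input pixels) of a stack of conv layers."""
    strides = strides or [1] * len(kernel_sizes)
    rf, jump = 1, 1  # jump = distance in input pixels between adjacent outputs
    for k, s in zip(kernel_sizes, strides):
        rf += (k - 1) * jump  # each layer widens the field by (k - 1) * jump
        jump *= s
    return rf


# Assumption: five 3x3 convolutions, all with stride 1.
print(receptive_field([3] * 5))  # 11
```

Under these assumptions each output value of the discriminator depends on an 11x11 window of its input, which is the "patch" being discussed.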

xrenaa avatar Dec 09 '19 16:12 xrenaa

@xrenaa Thanks for your reply!

dianshan14 avatar Dec 10 '19 20:12 dianshan14

@xrenaa First, thanks for your answers. But I am confused: in the paper, regarding Figure 9, the authors say "the effective receptive field at the coarsest level is smaller, allowing to capture only fine textures". However, according to Figure 4, the effective patch size at the coarsest level is the biggest one. Did I miss or misunderstand something?

Thanks a lot.

xinhong-ho avatar Dec 13 '19 14:12 xinhong-ho

@xinhong-ho Actually, I don't see this sentence. The caption of Figure 9 is "A model with a small number of scales only captures textures. As the number of scales increases, SinGAN manages to capture larger structures as well as the global arrangement of objects in the scene." In my view, the finest image always has the same size, which is 141*250. When the number of scales in the model is small, the coarsest image is large, so the receptive field covers only a small part of it and the model cannot learn the global information.
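
A rough sketch of this argument, under stated assumptions: the finest longer side is fixed at 250px (from the size above), the receptive field is the 11px discussed earlier in this thread, and each scale is downsampled by a factor r = 4/3 relative to the next finer one. The value of r and the function name `coarsest_side` are assumptions made for illustration only.

```python
RECEPTIVE_FIELD = 11  # px, as discussed earlier in this thread


def coarsest_side(finest_side_px, num_scales, r=4 / 3):
    """Longer side of the coarsest image when the finest size is fixed.

    Assumes each scale is r times smaller than the next finer one.
    """
    return finest_side_px / r ** (num_scales - 1)


for n in (2, 4, 6, 9):
    side = coarsest_side(250, n)
    share = min(1.0, RECEPTIVE_FIELD / side)
    print(f"{n} scales: coarsest side ~{side:.0f}px, receptive field covers {share:.0%}")
```

With few scales the coarsest image stays large, so the 11px window sees only a small part of it; with more scales the coarsest image shrinks toward roughly 25px and the window covers much more of the scene.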

xrenaa avatar Dec 13 '19 16:12 xrenaa

Sorry, it is at the bottom of page 5. I got the point! So when the number of scales is small, the effective receptive field is small relative to the image, so it can't capture the global information. On the other hand, when the number of scales increases, the receptive field covers more of the coarsest image and can capture the global information. Thanks!

xinhong-ho avatar Dec 14 '19 10:12 xinhong-ho