ResNetFusion
model problem in paper
Hi, I'm a little confused about the paper and the code. There are five pooling layers in the discriminator structure shown in Figure 3 of the article, yet I can't find any pooling layer in the code. Is this a drawing problem or a code problem? If it is convenient and you would like to reply by email, here is my address: [email protected].
Can you tell me how you solved the problem at last?
The authors used a conv with stride=2 to replace the conv with stride=1 followed by a pooling layer.
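For example, something like this (just a sketch with made-up channel sizes, not the exact layers from the repo):

```python
import torch.nn as nn

# Block as drawn in Figure 3 of the paper: conv with stride=1 followed by pooling.
paper_block = nn.Sequential(
    nn.Conv2d(64, 128, kernel_size=3, stride=1, padding=1),
    nn.LeakyReLU(0.2),
    nn.AvgPool2d(kernel_size=2, stride=2),   # halves the spatial size
)

# Block as implemented in the code: a single conv with stride=2 does the same
# 2x downsampling, so no separate pooling layer appears.
code_block = nn.Sequential(
    nn.Conv2d(64, 128, kernel_size=3, stride=2, padding=1),
    nn.LeakyReLU(0.2),
)
```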
Thank you! I want to ask you another question. In the paper, the generator uses 3×3 and 1×1 filters in the last layer, so why are only 3×3 filters used in the code?
You are welcome. I think the author made a mistake in the writing and the figure: he may have tried both 1×1 and 3×3 convolution kernels and kept the one that worked better, but forgot to update the article. I didn't check this with the author; it is just my guess.
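In case it helps, the difference being discussed is roughly this (hypothetical channel counts; only the kernel sizes matter here):

```python
import torch.nn as nn

# Last layers as described in the paper: a 3x3 conv followed by a 1x1 conv.
paper_last = nn.Sequential(
    nn.Conv2d(64, 32, kernel_size=3, padding=1),
    nn.Conv2d(32, 1, kernel_size=1),   # 1x1 conv produces the fused image
    nn.Tanh(),
)

# Last layers as they seem to appear in the released code: 3x3 convs only.
code_last = nn.Sequential(
    nn.Conv2d(64, 32, kernel_size=3, padding=1),
    nn.Conv2d(32, 1, kernel_size=3, padding=1),
    nn.Tanh(),
)
```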
OK, thank you!
Excuse me, did this problem happen during your training?
File "train.py", line 110, in
Sorry, I haven't trained the model, but that is probably an issue with the Torch version; you could try searching for it on Google.
Hello, in the author's loss.py, the return value of tv_loss is self.tv_loss_weight * 2 * (h_tv[:, :, :h_x - 1, :w_x - 1] + w_tv[:, :, :h_x - 1, :w_x - 1]), which is different from the general TV loss return, self.TVLoss_weight * 2 * (h_tv / count_h + w_tv / count_w) / batch_size. I also don't know why the Laplacian loss returns the square of the gradient, self.laplacian_filter(x) ** 2. If you know the reason, could you please explain the choice of these two return values? Thank you so much.
Sorry, it took so long that I forgot the details of this paper. I remember that squaring the gradient image gave better training results and avoided values less than 0. The difference in the TV loss may be related to the padding, but I'm not sure.
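For reference, here is my rough reconstruction of the two TV-loss variants and the squared Laplacian (the class names and the Laplacian kernel are my own guesses, not copied from loss.py):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class StandardTVLoss(nn.Module):
    """The commonly used TV loss: summed squared forward differences,
    averaged over pixel count and batch size, returned as a scalar."""
    def __init__(self, tv_loss_weight=1.0):
        super().__init__()
        self.tv_loss_weight = tv_loss_weight

    def forward(self, x):
        batch_size, _, h_x, w_x = x.size()
        count_h = x[:, :, 1:, :].numel() // batch_size
        count_w = x[:, :, :, 1:].numel() // batch_size
        h_tv = torch.pow(x[:, :, 1:, :] - x[:, :, :h_x - 1, :], 2).sum()
        w_tv = torch.pow(x[:, :, :, 1:] - x[:, :, :, :w_x - 1], 2).sum()
        return self.tv_loss_weight * 2 * (h_tv / count_h + w_tv / count_w) / batch_size


class CroppedTVLoss(nn.Module):
    """My reading of the repo's variant: the squared differences are kept as
    per-pixel maps and cropped to a common (h_x - 1, w_x - 1) size before being
    added, so the result is a map rather than a scalar; the cropping is where
    the padding question comes in."""
    def __init__(self, tv_loss_weight=1.0):
        super().__init__()
        self.tv_loss_weight = tv_loss_weight

    def forward(self, x):
        _, _, h_x, w_x = x.size()
        h_tv = torch.pow(x[:, :, 1:, :] - x[:, :, :h_x - 1, :], 2)   # (N, C, h_x-1, w_x)
        w_tv = torch.pow(x[:, :, :, 1:] - x[:, :, :, :w_x - 1], 2)   # (N, C, h_x, w_x-1)
        return self.tv_loss_weight * 2 * (
            h_tv[:, :, :h_x - 1, :w_x - 1] + w_tv[:, :, :h_x - 1, :w_x - 1]
        )


class SquaredLaplacianLoss(nn.Module):
    """Sketch of the squared-Laplacian idea: squaring keeps the edge response
    non-negative, which is what the reply above refers to. Assumes a
    single-channel image tensor and a generic Laplacian kernel."""
    def __init__(self):
        super().__init__()
        kernel = torch.tensor([[0., 1., 0.],
                               [1., -4., 1.],
                               [0., 1., 0.]]).view(1, 1, 3, 3)
        self.register_buffer("kernel", kernel)

    def laplacian_filter(self, x):
        return F.conv2d(x, self.kernel, padding=1)

    def forward(self, x):
        return self.laplacian_filter(x) ** 2
```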
Thank you so much for your reply; I think I understand the meaning of the squared gradient now. If you still remember the code, could you recall the paper "Infrared and visible image fusion via detail preserving adversarial learning"? I still have some questions about its code.
In the paper there is a term G(x, y) in the target edge-enhancement loss; in the code, G(x, y) is represented by coefficient, computed as coefficient = pyramid_addition * alpha / 2 + 1. From the original paper, my understanding is that pyramid_addition itself is G(x, y), so why is it multiplied by alpha / 2 with 1 added? If you still remember, could you help me clear up this confusion? Also, there seems to be no TV loss in the original paper, so do you know why tv_loss is used in the code? (A rough sketch of my reading of the coefficient is below.)
Thanks again for your reply, and I wish you all the best.
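To make my reading concrete, this is roughly how I understand coefficient to be used (a sketch with invented names, assuming pyramid_addition is a normalised edge map and the loss is a weighted pixel loss, not the actual code):

```python
import torch

def edge_enhanced_content_loss(fused, target, pyramid_addition, alpha=2.0):
    """Hypothetical helper, just my reading: pyramid_addition plays the role of
    G(x, y), and coefficient = pyramid_addition * alpha / 2 + 1 rescales it so
    that flat regions keep a weight of 1 while edge regions are weighted up by
    at most alpha / 2 (assuming pyramid_addition is normalised to [0, 1])."""
    coefficient = pyramid_addition * alpha / 2 + 1
    return torch.mean(coefficient * (fused - target) ** 2)
```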
Sorry, I've been so busy lately that I haven't had time to look at the code for this paper. I suggest you send a message to the author's email; he kindly answered my questions. Wishing you all the best too.