pytorch-deform-conv-v2

About the learning rate setting of p_conv and m_conv

Open • dontLoveBugs opened this issue on Feb 21, 2019 • 7 comments

You scale the gradients of p_conv and m_conv to 0.1 times those of the other layers, but I find the gradients are unchanged after backward. I used the following code to test it.

    import torch
    import torch.nn as nn
    from deform_conv_v2 import DeformConv2d  # the module defined in this repo

    # The hook as defined in the repo; it is registered on p_conv and
    # m_conv inside DeformConv2d.__init__ via register_backward_hook.
    def _set_lr(module, grad_input, grad_output):
        print('grad input:', grad_input)
        print('grad output:', grad_output)
        grad_input = (grad_input[i] * 0.1 for i in range(len(grad_input)))
        grad_output = (grad_output[i] * 0.1 for i in range(len(grad_output)))

    x = torch.randn(4, 3, 5, 5)
    y_ = torch.randn(4, 1, 5, 5)
    loss = nn.L1Loss()

    d_conv = DeformConv2d(inc=3, outc=1, modulation=True)

    y = d_conv(x)
    l = loss(y, y_)
    l.backward()

    print('p conv grad:')
    print(d_conv.p_conv.weight.grad)
    print('m conv grad:')
    print(d_conv.m_conv.weight.grad)
    print('conv grad:')
    print(d_conv.conv.weight.grad)

The gradient of p_conv is the same as the grad_input printed by the hook, but I think it should be 0.1 times that grad_input. Am I wrong?

dontLoveBugs commented on Feb 21, 2019

You're right! I'll fix it.

4uiiurz1 commented on Apr 19, 2019

> You're right! I'll fix it.

Have you solved this problem now?

BananaLv26 commented on Jul 24, 2019

@dontLoveBugs Hello, can you review my issue? I think the bilinear kernel is wrong.

jszgz commented on May 28, 2020

> You're right! I'll fix it.

A 'tuple' object cannot be modified in place. Your code just creates a generator and discards it.
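Here is a minimal standalone sketch of that point (the names are illustrative, not from the repo): rebinding a parameter name inside a function never changes the caller's objects.

    def hook(grad_input, grad_output):
        # This only rebinds the local name 'grad_input' to a new
        # (never-consumed) generator; the caller's tuple is untouched.
        grad_input = (g * 0.1 for g in grad_input)

    grads = (1.0, 2.0)
    hook(grads, grads)
    print(grads)  # (1.0, 2.0) -- unchanged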

zcong17huang commented on Sep 22, 2020

I searched online: the grad of the output cannot be modified. If you want to modify the grad of the input, you need to return the modified grad of the input from the hook, like:

    def _set_lr(module, grad_input, grad_output):
        return (grad_input[i] * 0.1 for i in range(len(grad_input)))

You can try it. My question is: why scale the p_conv gradients at all? Is it to avoid affecting the learning of the other feature-extraction branch?

XinZhangRadar commented on Dec 10, 2020

@XinZhangNLPR the error you get is because the backward hook expects a tuple, not a 'generator'.

> I searched online: the grad of the output cannot be modified. If you want to modify the grad of the input, you need to return the modified grad of the input from the hook, like: def _set_lr(module, grad_input, grad_output): return (grad_input[i] * 0.1 for i in range(len(grad_input)))
>
> You can try it. My question is: why scale the p_conv gradients at all? Is it to avoid affecting the learning of the other feature-extraction branch?

Your suggestion still returns a generator, not a tuple.
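A minimal sketch of a hook that satisfies the contract (keeping the repo's _set_lr name; the None check is my addition, since grad_input entries can be None):

    def _set_lr(module, grad_input, grad_output):
        # Return a tuple: register_backward_hook then uses it in place
        # of grad_input. Returning a generator raises the 'expected
        # tuple' error mentioned above; assigning locally does nothing.
        return tuple(g * 0.1 if g is not None else None
                     for g in grad_input)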

steven22tom commented on Dec 18, 2020

> You're right! I'll fix it.

It seems this bug still has not been fixed.
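A quick self-contained check of that (illustrative, not from the repo): register the repo-style hook on a plain nn.Conv2d and compare gradients against an identical unhooked copy. The repo's intent is for the hooked gradient to come out 0.1 times the other; instead they match. Note also that register_backward_hook is deprecated in recent PyTorch in favor of register_full_backward_hook, which has different grad_input semantics.

    import copy
    import torch
    import torch.nn as nn

    def _set_lr(module, grad_input, grad_output):
        # Repo-style hook: rebinds local names and returns nothing,
        # so autograd never sees the scaled values.
        grad_input = (grad_input[i] * 0.1 for i in range(len(grad_input)))
        grad_output = (grad_output[i] * 0.1 for i in range(len(grad_output)))

    conv = nn.Conv2d(3, 1, kernel_size=3, padding=1)
    conv_hooked = copy.deepcopy(conv)
    conv_hooked.register_backward_hook(_set_lr)

    x = torch.randn(4, 3, 5, 5)
    conv(x).sum().backward()
    conv_hooked(x).sum().backward()

    # Prints True: the hook had no effect on the weight gradient.
    print(torch.allclose(conv.weight.grad, conv_hooked.weight.grad))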

YXB-NKU commented on Oct 3, 2023