
How to fine-tune the refinement part when cascading two models together?

Open CeciliaPYY opened this issue 2 years ago • 2 comments

Hello xuebinqin, I ran into a tough problem when using U-2-Net alone, as shown below: the left is the prediction from U-2-Net with an input resolution of 768, and the right is the GT. [image: 32A9EB3C214C65F244A8FF0D06837B8D] https://user-images.githubusercontent.com/30582437/126053775-09b37781-00c5-4d6f-baa8-547eb1854358.jpg As you suggested to someone earlier in the issues, I cascaded U-2-Nets for high-resolution input, fixing the heavy one and training the light one. I followed a process similar to what you do in BASNet, but the model does not converge at all. Could you help me find out why? [image: WechatIMG1707] https://user-images.githubusercontent.com/30582437/126053840-7ca7ce36-aa50-4ff6-86f1-730763e02fc2.png

Here loss0 and loss are the same: both are the refinement loss of the second stage. Looking forward to your reply!

CeciliaPYY avatar Jul 18 '21 02:07 CeciliaPYY

You can try test-time augmentation to get more stable results. As for the convergence, we can't give exact reasons without debugging your code. You may have to double-check your code. There shouldn't be convergence issues if your implementation is correct.

-- Xuebin Qin, PhD, Department of Computing Science, University of Alberta, Edmonton, AB, Canada. Homepage: https://webdocs.cs.ualberta.ca/~xuebin/

xuebinqin avatar Jul 19 '21 14:07 xuebinqin
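
For readers unfamiliar with the test-time augmentation mentioned in the reply above, here is a minimal sketch of one common variant, horizontal-flip averaging. It is only an illustration under assumptions: `predict_with_flip_tta` and `toy_model` are hypothetical names, and the tiny convolutional model is a stand-in, not U-2-Net. The repository's network returns several side outputs, so in practice you would pick the fused prediction before averaging.

```python
import torch
import torch.nn as nn


def predict_with_flip_tta(model: nn.Module, image: torch.Tensor) -> torch.Tensor:
    """Average the prediction on an NCHW image and on its horizontal flip."""
    model.eval()
    with torch.no_grad():
        pred = model(image)
        # Flip the input along the width axis, predict, then flip the result back
        # so that both predictions are aligned pixel by pixel.
        pred_flipped = torch.flip(model(torch.flip(image, dims=[3])), dims=[3])
    return (pred + pred_flipped) / 2


torch.manual_seed(0)
# Hypothetical stand-in for a saliency network: 3-channel image in, 1-channel mask out.
toy_model = nn.Sequential(nn.Conv2d(3, 1, kernel_size=3, padding=1), nn.Sigmoid())
image = torch.rand(1, 3, 64, 64)
mask = predict_with_flip_tta(toy_model, image)
```

Other transforms (vertical flips, multi-scale resizing) can be averaged the same way, as long as each prediction is mapped back to the original geometry first.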

I'm wondering whether it would work to fine-tune only the second part with the first part fixed, using only the coarse mask from the first stage as input. Or should we also feed in the image, as is done in DIM (Deep Image Matting), or train the whole network jointly, as you did in BASNet?

CeciliaPYY avatar Jul 20 '21 03:07 CeciliaPYY
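
One of the options raised in the question above (a frozen first stage, with the second stage receiving the image concatenated with the coarse mask) could be sketched roughly as follows. This is not the setup from the thread or the authors' method: `CascadedRefiner`, `coarse_net`, and `refine_net` are hypothetical names, and the small convolutional stacks are toy stand-ins for the heavy and light networks. The sketch shows three things that are worth checking in a cascade like this: the frozen stage has `requires_grad` disabled, it stays in eval mode during training, and the optimizer only receives the second stage's parameters.

```python
import torch
import torch.nn as nn


class CascadedRefiner(nn.Module):
    """Frozen coarse stage followed by a trainable refinement stage (illustrative)."""

    def __init__(self, coarse_net: nn.Module, refine_net: nn.Module):
        super().__init__()
        self.coarse_net = coarse_net
        self.refine_net = refine_net
        # Freeze the first stage so no gradients are computed for it.
        for param in self.coarse_net.parameters():
            param.requires_grad = False

    def train(self, mode: bool = True):
        super().train(mode)
        # Keep the frozen stage in eval mode so BatchNorm/Dropout behave as at test time.
        self.coarse_net.eval()
        return self

    def forward(self, image: torch.Tensor):
        with torch.no_grad():
            coarse = self.coarse_net(image)
        # Second-stage input: RGB image plus coarse mask, 4 channels in total.
        refined_logits = self.refine_net(torch.cat([image, coarse], dim=1))
        return coarse, refined_logits


torch.manual_seed(0)
# Toy stand-ins (assumptions), not the real heavy and light U-2-Net variants.
coarse_net = nn.Sequential(nn.Conv2d(3, 1, kernel_size=3, padding=1), nn.Sigmoid())
refine_net = nn.Sequential(
    nn.Conv2d(4, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv2d(8, 1, kernel_size=3, padding=1),  # outputs logits
)
model = CascadedRefiner(coarse_net, refine_net)

# Only the refinement stage is handed to the optimizer.
optimizer = torch.optim.Adam(model.refine_net.parameters(), lr=1e-3)
criterion = nn.BCEWithLogitsLoss()

image = torch.rand(2, 3, 64, 64)
target = (torch.rand(2, 1, 64, 64) > 0.5).float()

# One training step on random data, just to show the mechanics.
model.train()
coarse, refined_logits = model(image)
loss = criterion(refined_logits, target)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

Feeding only the coarse mask to the second stage would amount to changing the first refinement convolution to take 1 input channel and dropping the `torch.cat`; joint training would amount to removing the freezing and the `no_grad` block.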