High resolution human segmentation

Open zichengf1997 opened this issue 3 years ago • 6 comments

Thanks for your great work! I'm trying to train u2net for human segmentation, but the image size for further inference is 1920*2560. The given pretrained model does not perform well. Would you please give me some suggestions about the training strategy? (like training image selection or net architecture modification)

zichengf1997 avatar May 25 '21 13:05 zichengf1997

You can try cascading two U2Nets (heavy + light). For the heavy U2Net, you can set the input resolution to 512x512. For the light U2Net (you can further reduce the number of filters or layers to build even smaller U2Nets), set the input resolution higher. Such a cascaded model will probably work better than a single-stage model.
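One possible reading of the cascade described above, as a minimal PyTorch sketch. It is not the maintainer's code: `tiny_net` and `Cascade` are hypothetical names, the two small conv stacks only stand in for a heavy and a light U2Net, and feeding the light network the image concatenated with the upsampled coarse mask is an assumption about how the two stages are joined. The repository's U2NET models return several side outputs, so in practice you would pick the fused one as the coarse mask.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def tiny_net(in_ch: int, width: int) -> nn.Module:
    """Hypothetical stand-in for a U2Net variant: in_ch channels -> 1 logit map."""
    return nn.Sequential(
        nn.Conv2d(in_ch, width, kernel_size=3, padding=1),
        nn.ReLU(inplace=True),
        nn.Conv2d(width, 1, kernel_size=3, padding=1),
    )


class Cascade(nn.Module):
    """Heavy net at low resolution, light net at the original resolution."""

    def __init__(self, heavy: nn.Module, light: nn.Module, coarse_size=(512, 512)):
        super().__init__()
        self.heavy = heavy
        self.light = light
        self.coarse_size = coarse_size

    def forward(self, image: torch.Tensor):
        # Stage 1: the heavy net sees a downscaled copy of the image.
        small = F.interpolate(image, size=self.coarse_size,
                              mode="bilinear", align_corners=False)
        coarse = torch.sigmoid(self.heavy(small))
        # Bring the coarse mask back to the input resolution.
        coarse_up = F.interpolate(coarse, size=image.shape[-2:],
                                  mode="bilinear", align_corners=False)
        # Stage 2 (assumed wiring): the light net sees RGB + coarse mask.
        refined = torch.sigmoid(self.light(torch.cat([image, coarse_up], dim=1)))
        return coarse_up, refined


# Sizes are shrunk so the demo runs quickly on CPU; the thread discusses
# 512x512 for the heavy stage and 1920*2560 inputs.
heavy = tiny_net(in_ch=3, width=16)
light = tiny_net(in_ch=4, width=4)
model = Cascade(heavy, light, coarse_size=(64, 64))
image = torch.rand(1, 3, 96, 128)
with torch.no_grad():
    coarse_up, refined = model(image)
print(coarse_up.shape, refined.shape)
```

The light stage runs at full resolution, which is why it needs to be small: its memory cost grows with the number of pixels, while the heavy stage's cost stays fixed by `coarse_size`.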


-- Xuebin Qin PhD Department of Computing Science University of Alberta, Edmonton, AB, Canada Homepage:https://webdocs.cs.ualberta.ca/~xuebin/

xuebinqin avatar May 25 '21 13:05 xuebinqin

Thanks for your suggestion! Does the cascade mean using the heavy U2Net's output as the light U2Net's input?

zichengf1997 avatar May 26 '21 01:05 zichengf1997

Yes, you can fix (freeze) the heavy U2Net and train only the light one. Whether this helps depends on your results: if the problem is edge accuracy, it will probably work. There are different strategies you can try, but I can't give specific details without seeing the failure cases.
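A minimal PyTorch sketch of "fix the heavy U2Net and train the light one", under stated assumptions: the two single-layer `heavy` and `light` modules are hypothetical stand-ins for the real networks, `train_step` is an illustrative helper, and `BCEWithLogitsLoss`, the Adam settings, and the 64x64 coarse size are placeholders chosen only to keep the example small.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical stand-ins: in practice `heavy` would be a pretrained U2Net
# and `light` a smaller U2Net taking RGB + coarse mask (4 channels).
heavy = nn.Conv2d(3, 1, kernel_size=3, padding=1)
light = nn.Conv2d(4, 1, kernel_size=3, padding=1)

# Fix the heavy network: eval mode and no gradients for its weights.
heavy.eval()
for p in heavy.parameters():
    p.requires_grad_(False)

# Only the light network's parameters are handed to the optimizer.
optimizer = torch.optim.Adam(light.parameters(), lr=1e-3)
criterion = nn.BCEWithLogitsLoss()


def train_step(image: torch.Tensor, target: torch.Tensor) -> float:
    """One optimization step that updates the light network only."""
    with torch.no_grad():  # the heavy stage is inference-only
        small = F.interpolate(image, size=(64, 64),
                              mode="bilinear", align_corners=False)
        coarse = torch.sigmoid(heavy(small))
        coarse_up = F.interpolate(coarse, size=image.shape[-2:],
                                  mode="bilinear", align_corners=False)
    logits = light(torch.cat([image, coarse_up], dim=1))
    loss = criterion(logits, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Calling `train_step(image, target)` with an `(N, 3, H, W)` image batch and an `(N, 1, H, W)` binary mask returns the loss for that step. Because the heavy stage runs under `torch.no_grad()`, its activations are not kept for backpropagation, which saves memory at high resolution.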


xuebinqin avatar May 26 '21 07:05 xuebinqin

Thanks! I'll try several cascade models and report back with the results.

zichengf1997 avatar May 26 '21 09:05 zichengf1997

This sounds like using the light one as a refinement module, much like the residual refine module, doesn't it?

CeciliaPYY avatar Jun 13 '21 08:06 CeciliaPYY

I'm wondering about "fix the heavy u2net": in the model/training script you provided, the input size is 288. If the heavy model was trained with 288 inputs and is then fixed, can it still give good results when it is fed 512 inputs?

CeciliaPYY avatar Jun 13 '21 13:06 CeciliaPYY