
[General Issue][Matting] Performance of MODNet is poor when trained on 1024x1024

Open hackkhai opened this issue 2 years ago • 8 comments

Thanks for your issue. To help us solve the issue better, please provide the following information:

  1. PaddleSeg version: 2.6
  2. PaddlePaddle version: 2.3
  3. Operating system: Linux
  4. Python version: Python 3.7
  5. CUDA/cuDNN version: CUDA 11.2, cuDNN 8
  6. Additional context:

Experiments: I tried all combinations of the following:
  1. Models: PPMatting, HumanMatting, MODNet
  2. Resolution: 512, 1024
  3. Resizing techniques: ResizeByShort, LimitShort
  4. Backbones: MobileNetV2, HRNet-W18, HRNet-W48

Observations:
  1. Of all the combinations, MODNet at 512x512 did best at identifying the salient object (though still not great). PPMatting was good, but it did not come close to MODNet, even at the higher resolution.
  2. All models perform badly when trained at the higher resolution, and MODNet does worse than it did with 512x512 training.
  3. For human images, the detail module is good, but the semantic module seems to perform badly (this is my assumption, based on what I observed), cutting off the person's legs or hands.
  4. When trained on real-life images, the model does poorly on studio-quality images and decently on real-life images. When trained on studio-quality images, it does badly on real-life images.
  5. Model performance degrades as more kinds of salient objects are added. For example, training on food, clothing, shoes, and bags together makes the model's performance pretty average.

Questions:
  1. How can I make the model do better at 1024 resolution?
  2. What am I doing wrong with PPMatting? After reading the paper, I thought it would do better than MODNet on higher-resolution images.
  3. How can I make the model do matting for multiple object types, such as personal care products, cars, packaged goods, etc.?
  4. How can I make the model robust for both studio-quality and real-life images?
  5. Any suggestions or experiments that I could try?
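
For readers comparing the two resizing techniques listed in the experiments above, here is a minimal sketch of how each one picks an output size. It is written from my understanding of what ResizeByShort and LimitShort do, not copied from PaddleSeg: the function names `resize_by_short` and `limit_short`, their parameters, and the rounding are assumptions for illustration only.

```python
def resize_by_short(width, height, short_size):
    """Scale so the shorter side becomes exactly short_size (aspect ratio kept)."""
    scale = short_size / min(width, height)
    return round(width * scale), round(height * scale)


def limit_short(width, height, max_short=None, min_short=None):
    """Only rescale when the shorter side falls outside [min_short, max_short]."""
    short = min(width, height)
    target = short
    if max_short is not None and short > max_short:
        target = max_short      # too large: shrink down to the upper bound
    elif min_short is not None and short < min_short:
        target = min_short      # too small: enlarge up to the lower bound
    if target == short:
        return width, height    # already inside the allowed range: unchanged
    scale = target / short
    return round(width * scale), round(height * scale)


if __name__ == "__main__":
    # A 600x400 image is upscaled by resize_by_short but left alone by limit_short.
    print(resize_by_short(600, 400, 512))
    print(limit_short(600, 400, max_short=512))
```

Under these assumptions, the practical difference is that the first always forces every image to the same short side (upscaling small images), while the second leaves images that are already within the limit at their native size, so the two can feed quite different input scales to the same model.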

hackkhai avatar Jul 21 '22 10:07 hackkhai

  1. At 1024 resolution, PPMatting is better than MODNet.
  2. You may need a dataset first.
  3. You could try training the model on a combination of studio-quality images and real-life images.
  4. During training, you can try more data augmentation, chosen according to the different scenarios.
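
As a rough illustration of the suggestion to combine studio-quality and real-life images, the sketch below builds one mixed training list from two sources with a controllable ratio. It is plain Python, not PaddleSeg code: `mixed_training_list`, `studio_ratio`, and the file names in the example are hypothetical, and the right ratio would have to be found by experiment.

```python
import random


def mixed_training_list(studio_items, real_items, num_samples,
                        studio_ratio=0.5, seed=0):
    """Draw num_samples items, taking each from studio_items with
    probability studio_ratio and from real_items otherwise."""
    if not studio_items or not real_items:
        raise ValueError("both sources must be non-empty")
    rng = random.Random(seed)  # fixed seed so the list is reproducible
    mixed = []
    for _ in range(num_samples):
        source = studio_items if rng.random() < studio_ratio else real_items
        mixed.append(rng.choice(source))
    return mixed


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    studio = ["studio_0001.png", "studio_0002.png"]
    real = ["real_0001.jpg", "real_0002.jpg", "real_0003.jpg"]
    for name in mixed_training_list(studio, real, num_samples=6):
        print(name)
```

Sampling by ratio instead of simply concatenating the two sets keeps the smaller source from being drowned out when one domain has far more images than the other.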

wuyefeilin avatar Jul 21 '22 11:07 wuyefeilin

Thanks for the reply. I do have a dataset of 80k annotated images. How do I fix the multiple-class object matting?

hackkhai avatar Jul 21 '22 12:07 hackkhai

> Thanks for the reply. I do have a dataset of 80k annotated images. How do I fix the multiple-class object matting?

With that much data, it is difficult to know the distribution. We need to experiment before giving you suggestions. If you do not mind, could you share your dataset with us, so that I can run some experiments with it?

My email is [email protected].

wuyefeilin avatar Jul 22 '22 01:07 wuyefeilin

Hey, I sent you an email, please check.

hackkhai avatar Jul 25 '22 03:07 hackkhai

Hey, you haven't checked my email yet :) Please do check it @wuyefeilin

hackkhai avatar Aug 09 '22 08:08 hackkhai

@wuyefeilin it's been almost a month, please check the email.

hackkhai avatar Aug 17 '22 14:08 hackkhai

Sorry, but I did not receive your email. If you don't mind, please resend it.

wuyefeilin avatar Aug 23 '22 03:08 wuyefeilin

Can you share any other email address? Maybe the Baidu email address is not working for me.

hackkhai avatar Aug 23 '22 10:08 hackkhai

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.

github-actions[bot] avatar Dec 10 '22 17:12 github-actions[bot]