
Mask "shadows" in some images?

Open eduardathome opened this issue 3 years ago • 15 comments

In some predicted images, there is a noticeable trace of the applied masks. I attached a small example here; this is part of a bigger image: https://imgur.com/a/19pIbQo I circled in red the "shadows" that I'm referring to. They appear exactly where the random masks were applied.

Worth mentioning: this is a model that was trained on my own data, using the architecture and configuration proposed here: https://github.com/saic-mdal/lama/blob/main/configs/training/lama-regular.yaml

I am not certain what other information is relevant here but I will provide more if necessary.

Any suggestions on what the issue is and what could fix it are welcome. Thanks, and thank you for the great project.

eduardathome avatar Jan 07 '22 08:01 eduardathome

Hi! Sorry for the late reply.

Do these artifacts always appear? Or are there specific conditions under which they are more noticeable than usual?

Basically, you can try setting training_model.image_to_discriminator=inpainted - that way, the discriminator and the losses will see a blended image gen_output * mask + input * (1 - mask) instead of the raw gen_output.
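For reference, that blend is a simple convex composite. A minimal NumPy sketch of the formula above (the actual code operates on batched torch tensors; the function name here is just illustrative):

```python
import numpy as np

def blend_inpainted(gen_output, input_img, mask):
    # mask is 1.0 inside the hole (missing areas) and 0.0 in known areas,
    # so known pixels come straight from the input and only the hole
    # is filled from the generator output.
    return gen_output * mask + input_img * (1.0 - mask)

# Tiny 2x2 single-channel example: only the top-left pixel is "missing".
inp = np.array([[1.0, 2.0], [3.0, 4.0]])
gen = np.full((2, 2), 9.0)
msk = np.array([[1.0, 0.0], [0.0, 0.0]])
print(blend_inpainted(gen, inp, msk))  # top-left taken from gen, rest from inp
```

With this setting, the discriminator never sees the generator's raw output outside the hole, so it cannot penalize (or reward) anything the generator does in the known region.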

windj007 avatar Jan 19 '22 10:01 windj007

How big is your dataset? What is your training resolution and testing resolution?

windj007 avatar Jan 19 '22 10:01 windj007

Hi, sorry for the even later reply.

Do these artifacts always appear? Or are there specific conditions under which they are more noticeable than usual?

They don't always appear: they are more prominent when the mask is over a solid color than over textures and meaningful content.

How big is your dataset? What is your training resolution and testing resolution?

The training set has 150,000 images. The validation and test sets are 300 images each. Resolutions vary, with the largest side being 1500px.

Basically, you can try setting training_model.image_to_discriminator=inpainted

I did one experiment so far: changing the config as you suggested and resuming training from a model already trained for 100 epochs. After 3, 10 and 30 more epochs, the artifacts did not improve. I will investigate further and attempt a fresh training session with this blended image used for the discriminator loss.

Also, I was going to try adding 10,000 generated images with random colors and gradients (and possibly similar JPEG encoding) to the next training session. Is this a naive approach?

I also tested the proposed model big-lama and I encountered the same artifacts, using the provided bin/predict.py script. I will attach 3 conclusive results. If you want to replicate this, let me know how you'd like me to attach the inputs.

eduardathome avatar Feb 03 '22 14:02 eduardathome

Hm... that's weird

At what resolution do you feed the images during training? Am I correct that you're applying the Lama-Regular model to images of much higher resolution than during training?

I also tested the proposed model big-lama

Do you mean that a pre-trained (by us, on Places) Big-Lama has the same artifacts? Or did you re-train it on your data?

If you want to replicate this, let me know how you'd like me to attach the inputs.

Yes, a dozen relevant images + masks would help a lot!

Thank you!

windj007 avatar Feb 03 '22 16:02 windj007

Hi again,

At what resolution do you feed the images during training? Am I correct that you're applying the Lama-Regular model to images of much higher resolution than during training?

You are right: I resize images to 512x512 during training, and yes, I use the Lama-Regular configuration. During prediction I don't resize; I use the original size (which varies, but is around 1500px).

Your assumption is probably correct: I tested the same prediction on 512x512 images, and the artifacts, while not entirely gone, were significantly reduced.
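Given that, one workaround I'm considering is running prediction at the training resolution and upsampling the result back to the original size. A dependency-light sketch of the idea (nearest-neighbour resize only to keep it self-contained; `inpaint_fn` is a hypothetical callable wrapping the model's forward pass, not a function from the repo):

```python
import numpy as np

def resize_nearest(img, out_h, out_w):
    # Dependency-free nearest-neighbour resize for HxW or HxWxC arrays.
    h, w = img.shape[:2]
    rows = (np.arange(out_h) * h) // out_h
    cols = (np.arange(out_w) * w) // out_w
    return img[rows][:, cols]

def predict_at_train_resolution(image, mask, inpaint_fn, train_size=512):
    # Downscale to the training resolution, inpaint there, then upscale
    # the result back to the original size.
    h, w = image.shape[:2]
    img_small = resize_nearest(image, train_size, train_size)
    mask_small = resize_nearest(mask, train_size, train_size)
    out_small = inpaint_fn(img_small, mask_small)
    return resize_nearest(out_small, h, w)
```

In practice one would use a smoother interpolation (e.g. bicubic for the image, nearest for the mask to keep it binary), and possibly paste the upscaled hole back into the full-resolution original so known pixels keep their detail.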

Do you mean that a pre-trained (by us, on Places) Big-Lama has the same artifacts? Or did you re-train it on your data?

Yes, I get similar artifacts using the model you provide, more exactly "The best model (Places2, Places Challenge)", downloaded with:

curl -L $(yadisk-direct https://disk.yandex.ru/d/ouP6l8VJ0HpMZg) -o big-lama.zip

Below I attached 12 images and masks that should replicate the artifacts. If you prefer them in a different format, let me know.

artifacts.zip

I will also follow up with 1-2 more general questions, so if possible, don't close the Issue yet.

Thanks for taking the time.

eduardathome avatar Feb 07 '22 15:02 eduardathome

Thank you! We'll check your images!

windj007 avatar Feb 16 '22 09:02 windj007

@windj007 I'm facing the same problem after training for 40 epochs on the FFHQ dataset. However, the artifacts are much more obvious. Any ideas to alleviate the problem? https://s2.loli.net/2022/03/08/ZiBMduyKPJporVF.jpg

ImmortalSdm avatar Mar 08 '22 02:03 ImmortalSdm

@ImmortalSdm Is this image from the training dataloader or from validation? If it is from validation, then such artifacts may appear when the mask is non-binary, e.g. when it has a smooth transition between 0 (known areas) and 1 (missing areas).
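A quick way to rule this out is to snap the mask to hard 0/1 values before feeding it to the model. A small sketch (the 0.5 threshold is an assumption, not something the repo prescribes):

```python
import numpy as np

def binarize_mask(mask, threshold=0.5):
    # Snap a soft / anti-aliased mask to hard 0/1 values so there is
    # no smooth transition band between known (0) and missing (1) areas.
    return (mask > threshold).astype(np.float32)

soft = np.array([0.0, 0.2, 0.49, 0.8, 1.0])
print(binarize_mask(soft))  # [0. 0. 0. 1. 1.]
```

Soft edges typically creep in when masks are saved as anti-aliased PNGs or resized with a smooth interpolation; binarizing after loading removes the transition band either way.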

windj007 avatar Mar 09 '22 06:03 windj007

@eduardathome Sorry for the late reply again!

Have you tried training at 256?

After releasing the codebase, we found that with Fourier-based generators, training at 256 yields more robust performance than training at higher resolutions - most probably due to characteristics of the loss functions.

windj007 avatar Mar 09 '22 06:03 windj007

@ImmortalSdm Is this image from the training dataloader or from validation? If it is from validation, then such artifacts may appear when the mask is non-binary, e.g. when it has a smooth transition between 0 (known areas) and 1 (missing areas).

Yep, it's from validation. Thanks for your reply; I will check my mask images.

ImmortalSdm avatar Mar 09 '22 10:03 ImmortalSdm

Has anyone solved this issue? I am also facing it.

Marcelo5444 avatar Mar 30 '22 16:03 Marcelo5444

Hi! I am facing some issues related to this. I am fine-tuning LaMa. At first, as a sanity check, I tried to overfit to a single image of CelebHQ. When using the predict.py file, everything works fine, but when training (overfitting) and saving the output of the network at different iterations, I obtain the result below. Top left: original image; top right: image with the mask on it; bottom left: output image of the network; bottom right: inpainted image. Input image size is 512 with batch size 1 (as I am overfitting to a single image).

epoch=1_0_legend

Marcelo5444 avatar Mar 30 '22 21:03 Marcelo5444

When fine-tuning a pretrained model, please keep in mind:

  1. Overfitting to a single image with a discriminator might break, because it is too easy for the discriminator to remember the real image exactly - so it wins and the training diverges.
  2. When we tried resuming training from a checkpoint, we encountered instabilities when the batch size or number of GPUs was changed on restart. I do not know exactly why, but it is most probably due to batchnorm or Adam statistics.

windj007 avatar Apr 08 '22 08:04 windj007

@ImmortalSdm Hello! Did you manage to train LaMa on FFHQ in the end? I am having issues and would greatly appreciate a trained checkpoint if you did.

Agrechka avatar Oct 18 '22 19:10 Agrechka


@eduardathome Have you solved the problem? I have the same issues as you. I would greatly appreciate it if you shared your solution.

yftongbupt avatar Dec 07 '22 09:12 yftongbupt