Finetuning taesd but get dim and less saturated images

Open stardusts-hj opened this issue 7 months ago • 3 comments

Hi madebyollin!

Thank you so much for your awesome work and kind reply. Sorry to bother you again. While trying to finetune the taesd decoder, I found that the output images come out a little dim and less saturated. I suspect the cause is a difference in the datasets or data processing strategies I've used. An example is shown below.

For the dataset, I use the same laion ae dataset, whose images I also find to be less saturated, since many of them show a single object centered on a white background. The images are resized to 512x512. I adopted the color augmentation you suggested, but the output is still less saturated than taesd's. Could you give some suggestions for dataset choice or data augmentation? Would it help to add more colorful datasets such as danbooru2021?
Besides, my training strategy is to train the taesd decoder with an LPIPS loss and a GAN loss. Which of the two do you think matters more for output quality? Can the LPIPS loss affect the saturation of the images? If so, how about using the GAN loss only?
It seems that taesd version 1.2 was trained starting from the weights of the previous version. Could you please share some details about that finetuning? Did you initialize the discriminator from the previous version, and why did you remove the LPIPS loss in version 1.2?

I sincerely appreciate your wonderful work and enthusiasm. It would be really great if you could give me some suggestions for finetuning taesd. Many thanks!

(left: taesd, right: my finetuned version)
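
To put a number on "dim and less saturated", here is a minimal sketch that compares mean HSV saturation and brightness between two sets of pixels. It is not part of taesd: the function name `mean_saturation_and_value` is hypothetical, the pixel values are toy data standing in for the two decoder outputs, and pixels are assumed to be RGB floats in [0, 1].

```python
import colorsys


def mean_saturation_and_value(pixels):
    """Return (mean S, mean V) in HSV for an iterable of (r, g, b) floats in [0, 1]."""
    total_s = 0.0
    total_v = 0.0
    count = 0
    for r, g, b in pixels:
        # colorsys works on floats in [0, 1] and returns (h, s, v).
        _, s, v = colorsys.rgb_to_hsv(r, g, b)
        total_s += s
        total_v += v
        count += 1
    if count == 0:
        raise ValueError("no pixels given")
    return total_s / count, total_v / count


# Toy pixels (hypothetical values), NOT real decoder outputs.
reference = [(1.0, 0.0, 0.0), (0.0, 0.5, 1.0)]
finetuned = [(0.8, 0.2, 0.2), (0.2, 0.5, 0.8)]

ref_s, ref_v = mean_saturation_and_value(reference)
ft_s, ft_v = mean_saturation_and_value(finetuned)
print(f"reference: S={ref_s:.3f} V={ref_v:.3f}")
print(f"finetuned: S={ft_s:.3f} V={ft_v:.3f}")
```

Running the same statistic over the training images, the original taesd output, and the finetuned output should help show whether the drop in saturation is already present in the dataset or is introduced during finetuning.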

Jul 03 '24 03:07 stardusts-hj
