
Weird loss and reconstruction results when training the autoencoder

Open OwalnutO opened this issue 2 years ago • 4 comments

I'm trying to train the AE on my own dataset (~100k images) with the default config file autoencoder_kl_32x32x4.yaml. I only decreased the learning rate from 4.5e-6 to 1e-6 since I use a smaller batch size. However, the training losses are weird and the reconstruction results are unsatisfactory. Could anyone give some suggestions? Thanks in advance!

[Attached images: training loss curves, reconstruction results, GT image]
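One thing worth checking before lowering the base LR by hand: the latent-diffusion training script already scales the base learning rate by the batch size and GPU count, so the effective LR shrinks automatically with a smaller batch. A minimal sketch of that scaling rule (function name and example values here are illustrative, not the poster's actual settings):

```python
def effective_lr(base_lr, batch_size, n_gpus=1, accumulate_grad_batches=1):
    """Effective LR as computed in the latent-diffusion training script:
    accumulate_grad_batches * n_gpus * batch_size * base_lr."""
    return accumulate_grad_batches * n_gpus * batch_size * base_lr

# With the config's base LR of 4.5e-6 and a hypothetical batch size of 12 on 1 GPU:
lr = effective_lr(4.5e-6, batch_size=12)
```

If you both shrink the batch size and manually lower base_lr, the effective LR may end up much smaller than intended, which can slow convergence rather than stabilize it.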

OwalnutO avatar Aug 24 '23 02:08 OwalnutO

I think kl_weight might be too small; you could try increasing kl_weight in the config.
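For reference, kl_weight lives under the loss config of the autoencoder. A sketch of the relevant fragment of autoencoder_kl_32x32x4.yaml (the default value shown is an assumption; check your local copy):

```yaml
model:
  params:
    lossconfig:
      target: ldm.modules.losses.LPIPSWithDiscriminator
      params:
        kl_weight: 0.000001   # default; try raising e.g. to 1e-5 or 1e-4
```

A larger kl_weight regularizes the latent distribution more strongly toward the prior, at some cost in reconstruction fidelity, so it is worth tuning gradually.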

illrayy avatar Aug 28 '23 15:08 illrayy

@OwalnutO Hi, I've encountered the same problem. Have you solved it by increasing kl_weight? Thank you!

houqingying avatar Oct 12 '23 03:10 houqingying

You may find answers in #187. By the way, why does the reconstruction have a different size from the GT image?

GuHuangAI avatar Oct 24 '23 14:10 GuHuangAI