soft-intro-vae-pytorch

Image quality deteriorates at final image resolution

Open · JasperLinmans opened this issue 2 years ago · 1 comment

First off, amazing work on the soft-intro VAEs! Great results, and awesome job on sharing the code, including tutorials.

I'm reaching out for some assistance, as I've been able to reproduce high-quality images on the FFHQ dataset, but I'm struggling to achieve similar results with my own dataset of 256x256 histopathology images. The output seems to deteriorate significantly after the last resolution step. I'm wondering if you have any suggestions for improving the performance of the VAE on this type of dataset.

Here are some samples from mid-training: [image: sample_95_10]

Here are samples from well after the final image resolution step: [image: sample_192_10]

Training loss: [image: losses]

I've experimented a bit with the loss-weight hyperparameters, BETA_KL in [0.05, 0.2, 0.4] and BETA_REC in [0.05, 0.1, 0.2, 0.4], but this doesn't seem to help much. I also tried lowering the learning rate in the final resolution step, with similar results. Any suggestions to improve the performance?
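For readers who want to repeat this kind of search systematically, here is a minimal sketch that enumerates the BETA_KL and BETA_REC values quoted above. It assumes a full grid over both lists, which the post does not actually state, and the `sweep_configs` helper and the `beta_kl` / `beta_rec` key names are hypothetical, not part of the repository.

```python
from itertools import product

# Values quoted in the issue text above.
BETA_KL_VALUES = [0.05, 0.2, 0.4]
BETA_REC_VALUES = [0.05, 0.1, 0.2, 0.4]


def sweep_configs(beta_kl_values, beta_rec_values):
    """Return one config dict per (beta_kl, beta_rec) combination.

    Hypothetical helper: the key names are illustrative and are not
    taken from the repository's training script.
    """
    return [
        {"beta_kl": kl, "beta_rec": rec}
        for kl, rec in product(beta_kl_values, beta_rec_values)
    ]


configs = sweep_configs(BETA_KL_VALUES, BETA_REC_VALUES)
for cfg in configs:
    # Assumption: each cfg would be handed to your own training entry point.
    print(cfg)
```

Logging the run index next to each configuration makes it easier to match a sample grid or loss curve back to the weights that produced it.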

JasperLinmans · Oct 07 '23 18:10

Thank you for your kind words! I assume you are using the "Style-based architecture", which is based on ALAE's architecture. Note that it was designed for structured datasets (e.g., faces) and works well for them. I'd try other architectures for your dataset, or take a look at the current architecture and perhaps change the depth (number of layers) and the number of filters per layer (go up or down, and see how it affects training). Good luck!
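One way to explore the depth and filter-count suggestion is to make both explicit parameters and tabulate the resulting layer plan before touching the model code. The sketch below is not the repository's architecture: `channel_schedule` is a hypothetical helper, and it assumes each encoder stage halves the resolution and doubles the filter count up to a cap.

```python
def channel_schedule(image_size=256, depth=6, base_channels=32, max_channels=512):
    """List (input_resolution, out_channels) for each encoder stage.

    Hypothetical helper, assuming every stage downsamples by 2 and
    doubles the number of filters until max_channels is reached.
    """
    if image_size % (2 ** depth) != 0:
        raise ValueError("image_size must be divisible by 2**depth")
    stages = []
    resolution = image_size
    channels = base_channels
    for _ in range(depth):
        stages.append((resolution, channels))
        resolution //= 2
        channels = min(channels * 2, max_channels)
    return stages


# Compare a deeper, wider plan against a shallower, narrower one.
for depth, base in [(6, 32), (4, 16)]:
    print(depth, base, channel_schedule(256, depth, base))
```

Changing one of `depth` or `base_channels` at a time keeps it clear which change is responsible for any difference in sample quality.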

taldatech · Oct 08 '23 02:10