
No Variance Scaling of Latent Space for KL-Regularized Autoencoder

maltesilber opened this issue on Jun 27 '24 · 0 comments

The paper states that the latent space is scaled based on the variance of the first batch (see the attached screenshots: scaling1, scaling2).
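For reference, my reading of the scaling described there is roughly the following (a minimal sketch; `estimate_scale_factor` is my own name, and I assume an `AutoencoderKL`-style `encode` that returns a posterior distribution):

```python
import torch

@torch.no_grad()
def estimate_scale_factor(autoencoder, first_batch):
    # Encode the first training batch and derive the factor that
    # rescales the latent space to roughly unit standard deviation,
    # as described in the paper.
    posterior = autoencoder.encode(first_batch)
    z = posterior.sample()
    return 1.0 / z.flatten().std()
```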

However, this behaviour does not seem to be implemented; this is the forward pass used in the training/validation step: https://github.com/CompVis/latent-diffusion/blob/a506df5756472e2ebaf9078affdde2c4f1502cd4/ldm/models/autoencoder.py#L335-L342
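Paraphrasing those lines, the forward pass just samples from the posterior and decodes, with nothing that looks like the described rescaling in between:

```python
def forward(self, input, sample_posterior=True):
    posterior = self.encode(input)
    if sample_posterior:
        z = posterior.sample()
    else:
        z = posterior.mode()
    dec = self.decode(z)
    # z is decoded as-is; no scale factor is applied anywhere here
    return dec, posterior
```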

When trying to fine-tune my own model I realized this, as the KL loss is the only loss that worsens. In the end I got a model with extremely high variance in the latent space (see the attached kl_loss plot).
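For anyone wondering why a blown-up latent space shows up specifically in the KL loss: the KL term penalizes the posterior's divergence from a standard normal, so it grows with both the spread of the latent means and any departure of the per-element variance from 1. A minimal sketch of the per-element term (my own function, equivalent in spirit to the repo's KL computation):

```python
import torch

def kl_to_standard_normal(mean, logvar):
    # Per-element KL(N(mean, var) || N(0, 1)); grows as the latent
    # means spread out and as the variance departs from 1, which is
    # why an exploding latent space drives this loss up.
    return 0.5 * (mean.pow(2) + logvar.exp() - 1.0 - logvar)
```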

Is my understanding correct that the scaling logic is missing from `AutoencoderKL`?
