denoising-diffusion-pytorch
Dumb Question about DM vs LDM
Do they differ only in the use of a VAE to encode the inputs into an embedding (plus the conditioning input)? So if I wanted to run this in latent space, would I just wrap this whole thing within the VAE?
@sztoo yeah, I'm trying to figure that out too, but I don't see why it shouldn't work out of the box on VAE latents, since the latent space is regularized toward zero mean and unit variance
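For what it's worth, here is a minimal sketch of that wrapping idea, under assumptions. `LatentDiffusionWrapper`, `encode`, `decode`, and the toy stand-ins are hypothetical names made up for illustration. The only interface borrowed from this repo is the README usage, where calling the diffusion object on a batch returns a loss and `.sample(batch_size=...)` returns samples. With a real setup, `encode`/`decode` would be a pretrained, frozen VAE, and the denoiser's channel count would presumably need to match the latent channels.

```python
class LatentDiffusionWrapper:
    """Run an existing diffusion object on VAE latents instead of pixels.

    Assumed (hypothetical) interfaces:
      encode(images)     -> latents  (pretrained, frozen VAE encoder)
      decode(latents)    -> images   (matching VAE decoder)
      diffusion(latents) -> loss
      diffusion.sample(batch_size=n) -> latents
    Any latent rescaling is assumed to live inside encode/decode.
    """

    def __init__(self, encode, decode, diffusion):
        self.encode = encode
        self.decode = decode
        self.diffusion = diffusion

    def training_loss(self, images):
        latents = self.encode(images)    # pixels -> latents
        return self.diffusion(latents)   # noising/denoising happens on latents

    def sample(self, batch_size):
        latents = self.diffusion.sample(batch_size=batch_size)
        return self.decode(latents)      # latents -> pixels


# Toy stand-ins so the sketch runs without any dependencies.
class ToyDiffusion:
    def __init__(self):
        self.last_input = None

    def __call__(self, latents):
        self.last_input = latents        # record what the diffusion model saw
        return 0.0                       # placeholder loss

    def sample(self, batch_size):
        return [[0.0, 0.0] for _ in range(batch_size)]


def toy_encode(images):
    # "Compress" each 4-value image into 2 latent values by pair averaging.
    return [[(img[0] + img[1]) / 2, (img[2] + img[3]) / 2] for img in images]


def toy_decode(latents):
    # "Decompress" by repeating each latent value twice.
    return [[z[0], z[0], z[1], z[1]] for z in latents]


toy_diffusion = ToyDiffusion()
ldm = LatentDiffusionWrapper(toy_encode, toy_decode, toy_diffusion)
loss = ldm.training_loss([[1.0, 3.0, 5.0, 7.0]])
images = ldm.sample(batch_size=2)
```

The point of the sketch is only that the diffusion object never sees pixels: it is trained and sampled entirely on whatever `encode` produces, and `decode` is applied once at the end.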
I'm currently looking at NaturalSpeech 2 and I'm confused why diffusion is done post-quantization there, as opposed to the Rombach et al. latent diffusion paper, where it is done before quantization.
It is briefly explained here: https://theaisummer.com/diffusion-models/