ExplainingAI
Got it. Regarding simple downscaling leading to loss of detail, another thing you could try: instead of passing a downsampled version, pass the normal (same size as the original image) mask...
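As a rough sketch of that idea (names and shapes here are hypothetical, not from the repo): keep the mask at full image resolution and only bring it down to the latent's spatial size at the point of concatenation, rather than pre-downsampling it as a data-loading step.

```python
import torch
import torch.nn.functional as F

def prepare_mask_cond(mask, latent):
    """Hypothetical helper: concatenate a full-resolution mask onto a latent.

    The mask stays at original image resolution until this point; nearest
    interpolation resizes it to the latent's spatial dims just before the
    channel-wise concat that feeds the conditional UNet.
    """
    mask = mask.float()
    mask_lat = F.interpolate(mask, size=latent.shape[-2:], mode='nearest')
    return torch.cat([latent, mask_lat], dim=1)
```

A learned mask encoder could replace the interpolation if hard nearest-neighbor resizing still loses too much boundary detail.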
Thank you so much for your support :) A regular DCGAN discriminator maps inputs of, say, shape 256x256 to a single scalar output, so in scenarios where you need to feed...
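To make the single-scalar point concrete, here is a minimal DCGAN-style discriminator sketch (layer widths and depth are illustrative, not taken from the repo): four stride-2 convolutions halve 256 down to 16, and a final full-width convolution collapses that to one scalar per image.

```python
import torch
import torch.nn as nn

class DCGANDisc(nn.Module):
    """Illustrative DCGAN-style discriminator: image -> single scalar."""

    def __init__(self, in_ch=3, base=64):
        super().__init__()
        layers, ch = [], in_ch
        # Each stride-2 conv halves spatial size: 256 -> 128 -> 64 -> 32 -> 16
        for out_ch in (base, base * 2, base * 4, base * 8):
            layers += [nn.Conv2d(ch, out_ch, 4, stride=2, padding=1),
                       nn.LeakyReLU(0.2)]
            ch = out_ch
        # Kernel covering the remaining 16x16 map collapses it to 1x1
        layers += [nn.Conv2d(ch, 1, kernel_size=16)]
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x).view(x.size(0))  # one scalar per image
```

A PatchGAN discriminator, by contrast, drops that final collapsing layer and emits a grid of per-patch scalars, which is why it adapts to other resolutions more easily.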
Hello @sunly92 , Thank you :) You are right that the scaling factor is not present, but this scaling is used by the authors only for the VAE and not the VQVAE. You...
Hello @Aman-Khokhar18 , Can you let me know which specific part you are having trouble configuring? Ideally, just updating the config with the right resolution and channels in https://github.com/explainingai-code/StableDiffusion-PyTorch/blob/main/config/celebhq.yaml#L3-L4 should...
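As a rough illustration (key names and values here are assumptions about the linked config, not copied from it), the edit would look something like:

```yaml
dataset_params:
  im_size: 256      # set to your dataset's resolution
  im_channels: 3    # 1 for grayscale datasets
```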
Hello @mdtayebadnan , For unconditional generation one should see decent face-like outputs within 100 epochs with a batch size of 16, while training for 200 epochs should further improve results...
Hello @wendeyy , I think you can use the mask-conditioned generation code to perform super-resolution without requiring too many changes. So say you want to train a...
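One way to sketch that reuse (a hypothetical helper under my own assumptions, not code from the repo): treat the upsampled low-resolution image as the conditioning signal, concatenated channel-wise exactly where the mask would normally go.

```python
import torch
import torch.nn.functional as F

def sr_conditioning(noisy, low_res):
    """Hypothetical sketch: condition on a low-res image for super-resolution.

    The low-res image is bilinearly upsampled to the target spatial size
    and concatenated channel-wise, mirroring how the mask-conditioned
    pipeline concatenates its mask channel.
    """
    up = F.interpolate(low_res, size=noisy.shape[-2:],
                       mode='bilinear', align_corners=False)
    return torch.cat([noisy, up], dim=1)
```

The model's input channel count would then need to grow by the low-res image's channel count, just as it grows by one for a binary mask.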
Yes, since this is a latent diffusion model, we would need to train a VAE (though the VAE on CelebHQ should not require more than 4-5 epochs to get a decent result). I...
For the autoencoder, a folder `vqvae_autoencoder_samples` should be created (inside /mnt/StableDiffusion-PyTorch-main/celebhq) containing reconstructions of images generated during training; just check the last image in that folder to see the...