ExplainingAI
Got it. Regarding simple downscaling leading to loss of detail, another thing you could try: instead of passing a downsampled version, pass the normal (same size as the original image) mask...
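As a rough sketch of that idea (names and shapes here are hypothetical, not from the repo): keep the mask at full image resolution and only bring it down to the latent's spatial size at the point of concatenation, rather than pre-downsampling it as a data-loading step.

```python
import torch
import torch.nn.functional as F

def prepare_mask_cond(mask, latent):
    """Hypothetical helper: concatenate a full-resolution mask onto a latent.

    The mask stays at original image resolution until this point; nearest
    interpolation resizes it to the latent's spatial dims just before the
    channel-wise concat that feeds the conditional UNet.
    """
    mask = mask.float()
    mask_lat = F.interpolate(mask, size=latent.shape[-2:], mode='nearest')
    return torch.cat([latent, mask_lat], dim=1)
```

A learned mask encoder could replace the interpolation if hard nearest-neighbor resizing still loses too much boundary detail.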
Thank you so much for your support :) A regular DCGAN discriminator maps inputs of, say, shape 256x256 to a single scalar output, so in scenarios where you need to feed...
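To make the single-scalar point concrete, here is a minimal DCGAN-style discriminator sketch (layer widths and depth are illustrative, not taken from the repo): four stride-2 convolutions halve 256 down to 16, and a final full-width convolution collapses that to one scalar per image.

```python
import torch
import torch.nn as nn

class DCGANDisc(nn.Module):
    """Illustrative DCGAN-style discriminator: image -> single scalar."""

    def __init__(self, in_ch=3, base=64):
        super().__init__()
        layers, ch = [], in_ch
        # Each stride-2 conv halves spatial size: 256 -> 128 -> 64 -> 32 -> 16
        for out_ch in (base, base * 2, base * 4, base * 8):
            layers += [nn.Conv2d(ch, out_ch, 4, stride=2, padding=1),
                       nn.LeakyReLU(0.2)]
            ch = out_ch
        # Kernel covering the remaining 16x16 map collapses it to 1x1
        layers += [nn.Conv2d(ch, 1, kernel_size=16)]
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x).view(x.size(0))  # one scalar per image
```

A PatchGAN discriminator, by contrast, drops that final collapsing layer and emits a grid of per-patch scalars, which is why it adapts to other resolutions more easily.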
Hello @sunly92 , Thank you :) You are right that the scaling factor is not present, but this scaling is used by the authors only for the VAE and not the VQVAE. You...
Hello @Aman-Khokhar18 , Can you let me know which specific part you are having trouble configuring? Ideally, just updating the config with the right resolution and channels in https://github.com/explainingai-code/StableDiffusion-PyTorch/blob/main/config/celebhq.yaml#L3-L4 should...
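As a rough illustration (key names and values here are assumptions about the linked config, not copied from it), the edit would look something like:

```yaml
dataset_params:
  im_size: 256      # set to your dataset's resolution
  im_channels: 3    # 1 for grayscale datasets
```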
Hello @mdtayebadnan , For unconditional generation one should see decent face-like outputs within 100 epochs with a batch size of 16, while training for 200 epochs should further improve results...
Hello @wendeyy , I think you can use the mask-conditioned generation code to perform super-resolution without requiring too many changes. So say you want to train a...
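One way to sketch that reuse (a hypothetical helper under my own assumptions, not code from the repo): treat the upsampled low-resolution image as the conditioning signal, concatenated channel-wise exactly where the mask would normally go.

```python
import torch
import torch.nn.functional as F

def sr_conditioning(noisy, low_res):
    """Hypothetical sketch: condition on a low-res image for super-resolution.

    The low-res image is bilinearly upsampled to the target spatial size
    and concatenated channel-wise, mirroring how the mask-conditioned
    pipeline concatenates its mask channel.
    """
    up = F.interpolate(low_res, size=noisy.shape[-2:],
                       mode='bilinear', align_corners=False)
    return torch.cat([noisy, up], dim=1)
```

The model's input channel count would then need to grow by the low-res image's channel count, just as it grows by one for a binary mask.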
Yes, since this is a latent diffusion model, we would need to train a VAE (though the VAE on CelebHQ should not require more than 4-5 epochs to get a decent result). I...
For the autoencoder, a folder `vqvae_autoencoder_samples` should be created (inside /mnt/StableDiffusion-PyTorch-main/celebhq) containing reconstructions of images generated during training; just check the last image in that folder to see the...