diffae
How is the autoencoding happening in the code during inference?
I understand that autoencoding is performed on a sampled batch of images. However, I am unsure where those original images are used to condition the generation. In other words, I was expecting cond = encoder(x_start) to be passed as the condition to the DDIM sampler, but I don't see that happening, and yet the generated images are somehow reconstructions of the sampled batch of real images.
I am running run_ffhq128.py (just the first training stage). I see that cond = None is passed while sampling. I went deeper into the code and thought that at least _xstart would be used as conditioning, but it is passed through **kwargs into p_mean_variance and ultimately not used. Am I missing something?
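To make the expectation concrete, here is a rough sketch of the flow I was expecting to find. It is a toy illustration only, not the repository's code: toy_encoder, toy_eps_model, ddim_step, and sample are hypothetical stand-ins, the "images" are flat lists of floats, and the noise schedule is made up. The point is simply that cond is computed from x_start once and then handed to the noise predictor at every deterministic DDIM step.

```python
import math
import random


def toy_encoder(x_start):
    # Hypothetical stand-in for a semantic encoder: one scalar code per image.
    return sum(x_start) / len(x_start)


def toy_eps_model(x_t, t, cond=None):
    # Hypothetical noise predictor. When cond is None it ignores the source image.
    shift = 0.0 if cond is None else cond
    return [0.1 * v - shift for v in x_t]


def ddim_step(x_t, a_t, a_prev, eps):
    # Deterministic DDIM update (eta = 0), written with cumulative alphas.
    out = []
    for v, e in zip(x_t, eps):
        x0_pred = (v - math.sqrt(1.0 - a_t) * e) / math.sqrt(a_t)
        out.append(math.sqrt(a_prev) * x0_pred + math.sqrt(1.0 - a_prev) * e)
    return out


def sample(x_T, alpha_bars, cond=None):
    # Walk from the noisiest step back to t = 0, passing cond at every step.
    x = list(x_T)
    for t in reversed(range(len(alpha_bars))):
        a_prev = alpha_bars[t - 1] if t > 0 else 1.0
        eps = toy_eps_model(x, t, cond=cond)
        x = ddim_step(x, alpha_bars[t], a_prev, eps)
    return x


random.seed(0)
x_start = [random.gauss(0.0, 1.0) for _ in range(16)]  # the "real" image
x_T = [random.gauss(0.0, 1.0) for _ in range(16)]      # starting noise
alpha_bars = [0.99 - 0.098 * i for i in range(10)]     # made-up schedule

cond = toy_encoder(x_start)                            # what I expected to see
recon = sample(x_T, alpha_bars, cond=cond)
uncond = sample(x_T, alpha_bars, cond=None)            # what I see being passed
```

In this toy setup the two outputs differ, because the source image only enters through cond. That is why I am confused that sampling with cond = None still produces reconstructions.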