diffae
Questions on the semantic codes
Thank you for your amazing work! I am trying to use it for an editing task, and I have a few questions after reading the paper.
- As I understand it, z_sem is simply the latent code of an autoencoder rather than the latent of a generative model, much like the encoder outputs in latent diffusion or CLIP models. Is that correct?
- If so, why can we interpolate or manipulate z_sem directly? I can see why manipulating the sampled noise in the latent DDIM would work, but I do not understand why the same is possible in the semantic space Z. Is there an explicit design choice that enables this? Or is it due to the domain-specific dataset or the normalization of z_sem?
I also wonder whether it would work better to map z_sem to some generative latent space and do the manipulation there.
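To make the question concrete, here is a minimal sketch of what I mean by interpolating directly in the semantic space. It is not the repository's API: the helper names `lerp` and `slerp` and the toy vectors are hypothetical stand-ins, and I am assuming (from my reading of the paper) that z_sem is interpolated linearly while the noise map x_T is interpolated spherically.

```python
import math


def lerp(a, b, t):
    """Linear interpolation between two equal-length vectors."""
    return [(1.0 - t) * x + t * y for x, y in zip(a, b)]


def slerp(a, b, t):
    """Spherical linear interpolation between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Clamp to guard against rounding pushing the cosine outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    theta = math.acos(cos_theta)
    if theta < 1e-6:
        # Nearly parallel vectors: fall back to plain linear interpolation.
        return lerp(a, b, t)
    s = math.sin(theta)
    w_a = math.sin((1.0 - t) * theta) / s
    w_b = math.sin(t * theta) / s
    return [w_a * x + w_b * y for x, y in zip(a, b)]


# Toy stand-ins (hypothetical values) for two encoded images.
z_sem_a, z_sem_b = [1.0, 0.0, 2.0], [0.0, 1.0, 0.0]
x_T_a, x_T_b = [1.0, 0.0], [0.0, 1.0]

z_mid = lerp(z_sem_a, z_sem_b, 0.5)   # semantic code, moved in a straight line
x_mid = slerp(x_T_a, x_T_b, 0.5)      # noise map, moved along the sphere
```

My question is why the straight-line step on z_sem yields a meaningful image after decoding, given that nothing obviously constrains the points between two encoder outputs to lie on the data manifold.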