diffae

Questions on the semantic codes

Open moegi161 opened this issue 6 months ago • 0 comments

Thank you for your amazing work! I am trying to use it for an editing task, and I have some questions after reading the paper.

  1. In my understanding, z_sem is simply the latent code of an autoencoder, like the autoencoders in latent diffusion or CLIP models, and thus does not come from a generative model. Is that correct?
  2. If so, why can we interpolate or manipulate z_sem directly? I can understand manipulating the sampled noise in the latent DDIM, but I do not understand why this works in the semantic space Z. Is there any explicit design that enables it? Or is it due to the domain-specific dataset or the normalization of z_sem?
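
To make the second question concrete, here is a minimal sketch of what I mean by interpolating z_sem directly versus interpolating the sampled noise. It is only an illustration under my own assumptions: the function names `lerp` and `slerp` are hypothetical and not taken from the diffae code, the arrays are random stand-ins rather than real encoder outputs, and the sizes 512 and 3x64x64 are placeholders.

```python
import numpy as np

def lerp(a, b, t):
    """Linear interpolation between two codes, e.g. two z_sem vectors."""
    return (1.0 - t) * a + t * b

def slerp(a, b, t, eps=1e-8):
    """Spherical interpolation, commonly used for Gaussian noise maps."""
    a_flat, b_flat = a.ravel(), b.ravel()
    cos = np.dot(a_flat, b_flat) / (
        np.linalg.norm(a_flat) * np.linalg.norm(b_flat) + eps
    )
    omega = np.arccos(np.clip(cos, -1.0, 1.0))  # angle between the two codes
    if omega < eps:
        # Nearly parallel inputs: fall back to linear interpolation.
        return lerp(a, b, t)
    so = np.sin(omega)
    return (np.sin((1.0 - t) * omega) / so) * a + (np.sin(t * omega) / so) * b

# Random stand-ins for encoder outputs (placeholders, not real data).
rng = np.random.default_rng(0)
z_sem_a, z_sem_b = rng.standard_normal((2, 512))        # semantic codes
x_T_a, x_T_b = rng.standard_normal((2, 3, 64, 64))      # noise maps

t = 0.5
z_sem_mid = lerp(z_sem_a, z_sem_b, t)   # "manipulating z_sem directly"
x_T_mid = slerp(x_T_a, x_T_b, t)        # interpolating the sampled noise
```

The part I do not understand is why decoding something like `z_sem_mid` should give a meaningful image when nothing constrains the points between two encoder outputs the way a prior constrains the noise.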

I wonder whether it would work better to map z_sem to some generative latent space and do the manipulation there.

moegi161, Aug 14 '24 08:08