AnimateAnyone
Some Questions About the ReferenceNet
The ReferenceNet takes the VAE-encoded image as input. Is noise added to these latents before they are passed in?
If you do not add noise to the ReferenceNet image latents, do you call the ReferenceNet U-Net multiple times, once per denoising step with the same timestep as the denoising network, or do you call it only once with a single timestep?
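To make the second question concrete, here is a toy sketch of the two options I have in mind. Every name in it (`reference_net`, `denoise_step`, `option_a`, `option_b`, `fixed_t`) is a hypothetical stand-in I made up for illustration, not something taken from the AnimateAnyone code, and the arithmetic is a placeholder. In both options the reference latent stays clean (no noise added).

```python
# Toy stand-ins only: none of these names come from the AnimateAnyone code.
calls = []  # records the timestep passed to every ReferenceNet call


def reference_net(ref_latent, t):
    """Hypothetical ReferenceNet U-Net applied to a clean reference latent."""
    calls.append(t)
    return list(ref_latent)  # placeholder for the extracted reference features


def denoise_step(x, t, ref_feats):
    """Hypothetical denoising U-Net step conditioned on reference features."""
    return [0.9 * xi + 0.1 * fi for xi, fi in zip(x, ref_feats)]


def option_a(x, ref_latent, timesteps):
    # Option A: re-run the ReferenceNet at every step, using the same
    # timestep as the denoising network.
    for t in timesteps:
        x = denoise_step(x, t, reference_net(ref_latent, t))
    return x


def option_b(x, ref_latent, timesteps, fixed_t=0):
    # Option B: run the ReferenceNet once at a single fixed timestep
    # (fixed_t is an assumed value) and reuse its features for every step.
    ref_feats = reference_net(ref_latent, fixed_t)
    for t in timesteps:
        x = denoise_step(x, t, ref_feats)
    return x
```

Option A costs one ReferenceNet forward pass per denoising step, while option B costs a single pass in total, so knowing which one you use matters for reproducing both the results and the inference speed.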