zero123
Confusion about the conditioning embedding
Hello,
I have a question about the conditioning embedding that is fed into the diffusion model (here is your code):
clip_emb = self.model.cc_projection(torch.cat([self.clip_emb, T[None, None, :]], dim=-1))
cond['c_crossattn'] = [torch.cat([torch.zeros_like(clip_emb).to(self.device), clip_emb], dim=0)]
cond['c_concat'] = [torch.cat([torch.zeros_like(self.vae_emb).to(self.device), self.vae_emb], dim=0)]
Why do you concatenate torch.zeros_like(clip_emb) with clip_emb itself as the condition? Is this only to match the tensor size, or is there another reason?
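For context, this zeros-then-embedding concatenation along the batch dimension looks like the usual classifier-free guidance batching: one forward pass produces both an unconditional prediction (zeroed conditioning) and a conditional one, which are then combined with a guidance scale. Below is a minimal, hypothetical sketch of that pattern; `dummy_model`, `cfg_predict`, and `guidance_scale` are illustrative names, not from the zero123 code:

```python
def dummy_model(x, cond):
    # Stand-in for the diffusion UNet: the output depends on the conditioning.
    return [[xi + ci for xi, ci in zip(row_x, row_c)]
            for row_x, row_c in zip(x, cond)]

def cfg_predict(x, cond_emb, guidance_scale=3.0):
    # Build a batch of 2: row 0 uses a zeroed (unconditional) embedding,
    # row 1 uses the real conditioning -- mirroring
    # torch.cat([torch.zeros_like(clip_emb), clip_emb], dim=0).
    uncond_emb = [0.0] * len(cond_emb)
    batch_cond = [uncond_emb, cond_emb]
    batch_x = [x, x]  # the latent is duplicated to match the batch of 2
    eps_uncond, eps_cond = dummy_model(batch_x, batch_cond)
    # Classifier-free guidance: extrapolate from unconditional toward conditional.
    return [u + guidance_scale * (c - u) for u, c in zip(eps_uncond, eps_cond)]
```

So the zero half of the batch is not there just for tensor-size reasons; under this reading it supplies the unconditional branch of the guidance computation.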