diffae
Issues with Conditional Sampling
Hi,
Consider the following lines of code:
cond1 = model.encode(batch)
xT = model.encode_stochastic(batch, cond1, T=50)
pred = model.render(noise=xT, cond=cond1, T=20)
# xT_rand = torch.rand(xT.shape, device=device)
# pred_rand = model.render(noise=xT_rand, cond=cond1, T=20)
The autoencoding above works as expected. However, if I use xT_rand instead of xT with the same cond1, the predicted image is nothing but noise. Could you please help me understand why that happens? As mentioned in the paper, most of the semantic information is captured in z_sem, so why does it fail in this case?
Your response will be greatly appreciated.
Thank you!
torch.rand samples from a uniform distribution on [0, 1), which is not what the diffusion model was trained on. Please use torch.randn, which samples from a standard normal distribution.
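To make the mismatch concrete, here is a minimal sketch that compares the two distributions. It uses only the Python standard library, so it runs without the model; `rng.random()` and `rng.gauss(0.0, 1.0)` stand in for the distributions that torch.rand and torch.randn draw from, and the helper name `sample_stats` is made up for this illustration.

```python
import math
import random


def sample_stats(sampler, n=100_000, seed=0):
    """Return (mean, std) of n draws produced by sampler(rng)."""
    rng = random.Random(seed)
    draws = [sampler(rng) for _ in range(n)]
    mean = sum(draws) / n
    var = sum((x - mean) ** 2 for x in draws) / n
    return mean, math.sqrt(var)


# Uniform on [0, 1): the distribution torch.rand draws from.
uniform_mean, uniform_std = sample_stats(lambda rng: rng.random())

# Standard normal: the distribution torch.randn draws from.
normal_mean, normal_std = sample_stats(lambda rng: rng.gauss(0.0, 1.0))

print(f"uniform: mean={uniform_mean:.3f}, std={uniform_std:.3f}")
print(f"normal:  mean={normal_mean:.3f}, std={normal_std:.3f}")
```

The uniform samples are all non-negative, with a mean near 0.5 and a standard deviation near 0.289, while the normal samples are centered at 0 with a standard deviation near 1. Noise with the uniform statistics is far from the zero-mean, unit-variance input the sampler expects at step T.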
Hi, thank you for your quick response. Even with torch.randn, I still get distorted output. Here's an example:
(input - noise - prediction)
And this happens for all the examples I tested, not just this one. Do you have any insights into why this is happening?
Thanks again!
I'm not sure what the use case is here. Can you tell me the big picture? This doesn't seem to be the use case described in the paper.