Question about some techniques
Hi @pesser! Thank you for sharing the implementation of your wonderful work!
I have a few questions about some of the techniques. Could you help me with them?
I used your pretrained celeba256 weights. The images below were recorded with calls such as intermediates['x_inter'].append(img).
1. Why do you subsample the time steps? According to this line, it seems you pick one time value every num_ddpm_timesteps // num_ddim_timesteps steps. I have never seen this technique before.

2. If I do not subsample and instead run T: 1000 -> 0, i.e. use every consecutive time step in the range 1 to 1000, I cannot get clear results. This image was recorded from six separate 1000-iteration runs, using x_inter.

   This image was recorded using pred_x0.

3. If I keep the default time steps from this line but lower the start time, for example to 800, I cannot get clear results. Why? Does your method only work well when starting from t=1000? (With t=1000 it actually takes 50 iterations, because the time steps are subsampled.)

4. If your method cannot handle the case in question (3), does that mean it cannot perform the denoising technique shown in Sohl-Dickstein+ ICML15? Is there a way to accomplish it?
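
To make the three settings in my first three questions concrete, here is a minimal sketch of the time step schedules as I understand them. The helper name strided_timesteps and its t_start parameter are my own, made up for illustration, and are not taken from the repository. I am only assuming that the schedule is built from the stride num_ddpm_timesteps // num_ddim_timesteps.

```python
def strided_timesteps(num_ddpm_timesteps, num_ddim_timesteps, t_start=None):
    """Hypothetical helper (not from the repository).

    Picks one time step every `stride` DDPM steps and returns them in
    descending order, which is the order used when sampling.
    """
    stride = num_ddpm_timesteps // num_ddim_timesteps
    steps = list(range(0, num_ddpm_timesteps, stride))
    if t_start is not None:
        # Assumption: "lowering the start time" means dropping every
        # step at or above t_start.
        steps = [t for t in steps if t < t_start]
    return steps[::-1]


# Question 1: default setting, 50 subsampled steps out of 1000.
strided = strided_timesteps(1000, 50)

# Question 2: no subsampling, every one of the 1000 steps is visited.
full = strided_timesteps(1000, 1000)

# Question 3: default stride, but sampling starts below t = 800.
truncated = strided_timesteps(1000, 50, t_start=800)

print(len(strided), strided[:3])
print(len(full), full[:3])
print(len(truncated), truncated[:3])
```

In this sketch the second setting runs 1000 iterations and the third runs 40, compared with 50 for the default. Please correct me if the actual schedule differs from this.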

Best regards, Udon