zero123

Could you please give me a brief explanation of the log images?

Open yumeko717 opened this issue 1 year ago • 3 comments

Hi author. We trained with main.py and got five log images (input, condition, reconstruction, samples, samples_cfg_scale_3.00). I understand input and condition, but not the other three. What are samples and samples_cfg_scale_3.00? How can I use these two images to judge how well the training is going?

yumeko717 avatar Aug 16 '23 08:08 yumeko717

I also have the same confusion....

ys830 avatar Aug 24 '23 02:08 ys830

The "reconstruction" is the output of the VAE; since the VAE is an autoencoder, it usually looks the same as the "input". The "samples" and "samples_cfg_scale_3.00" are the results generated under the guidance of the "condition" and the "camera RT". The difference between them is that the former does not use unconditional guidance, while the latter uses unconditional guidance with a guidance scale of 3.0. Ideally, both "samples" and "samples_cfg_scale_3.00" should show the same object as the "input", seen from a viewpoint different from that of the "input".
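To make the guidance scale concrete, here is a minimal sketch of how classifier-free (unconditional) guidance usually combines two noise predictions at each sampling step. It is not taken from the repository: the function name `apply_cfg` and the toy lists are hypothetical, and the formula is the standard one, which I assume is what the sampler uses.

```python
def apply_cfg(eps_uncond, eps_cond, scale):
    """Blend unconditional and conditional noise predictions.

    eps_uncond: prediction with the condition dropped (hypothetical toy list)
    eps_cond:   prediction given the condition image and camera RT
    scale:      guidance scale; 1.0 returns the conditional prediction unchanged
    """
    return [u + scale * (c - u) for u, c in zip(eps_uncond, eps_cond)]


# Toy values standing in for per-pixel noise predictions.
eps_uncond = [0.0, 1.0]
eps_cond = [1.0, 3.0]

no_guidance = apply_cfg(eps_uncond, eps_cond, 1.0)  # analogous to "samples"
guided = apply_cfg(eps_uncond, eps_cond, 3.0)       # analogous to "samples_cfg_scale_3.00"
print(no_guidance, guided)
```

With a scale of 1.0 the unconditional term cancels out, so the result is the plain conditional prediction. A scale of 3.0 pushes the prediction further away from the unconditional one, which typically makes the output follow the condition more strongly.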

yanjk3 avatar Sep 27 '23 02:09 yanjk3

Thank you for your kind answer!

yumeko717 avatar Oct 18 '23 12:10 yumeko717