zero123
Could you please give me a brief explanation of the log images?
Hi author. We use main.py to train and obtain 5 log images (input, condition, reconstruction, samples, samples_cfg_scale_3.00). I know what input and condition are, but I don't understand the other three. What are samples and samples_cfg_scale_3.00? How can I assess how good the training is from these two images?
I also have the same confusion....
The "reconstruction" is the output of the VAE, which usually looks the same as the "input" because the VAE is an autoencoder. The "samples" and "samples_cfg_scale_3.00" images are the results generated under the guidance of the "condition" image and the camera RT. The difference between them is that the former does not use unconditional (classifier-free) guidance, while the latter does, with a guidance scale of 3.0. Ideally, both "samples" and "samples_cfg_scale_3.00" should show the same object as the "input", seen from a viewpoint different from the "input".
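To make the difference between the two sample images more concrete, here is a minimal sketch of the standard classifier-free guidance formula, written with NumPy on toy arrays. It is not taken from this repo's code: the function name `cfg_combine` and the random arrays standing in for the conditional and unconditional noise predictions are assumptions for illustration only.

```python
import numpy as np

def cfg_combine(eps_cond, eps_uncond, scale):
    """Standard classifier-free guidance mix of two noise predictions.

    Hypothetical helper for illustration; not a function from this repo.
    scale == 1.0 returns the purely conditional prediction.
    scale > 1.0 pushes the result further away from the unconditional one.
    """
    return eps_uncond + scale * (eps_cond - eps_uncond)

# Toy stand-ins for the model's two predictions at one denoising step.
rng = np.random.default_rng(0)
eps_cond = rng.normal(size=(4,))    # prediction given condition image + camera RT
eps_uncond = rng.normal(size=(4,))  # prediction with the conditioning dropped

plain = cfg_combine(eps_cond, eps_uncond, 1.0)   # analogous to "samples"
guided = cfg_combine(eps_cond, eps_uncond, 3.0)  # analogous to "samples_cfg_scale_3.00"
```

With a scale of 1.0 the unconditional term cancels out, which is why "samples" can be read as plain conditional sampling, while a scale of 3.0 amplifies whatever the conditioning contributes.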
Thank you for your kind answer!