stable-diffusion-reference-only

few questions.

Open yejy53 opened this issue 1 year ago • 3 comments

Thank you for your great work, but I have a few questions. 1. If I want to train on a new dataset, what format should the dataset be in? 2. Compared with the original ControlNet, is the only difference that the original text encoder is replaced with VAE-encoded image input?

yejy53 avatar Jan 14 '24 08:01 yejy53

@yejy53 Training data can be built with Hugging Face datasets. Each sample should contain three columns: blueprint (the line drawing), image prompt (the reference image), and image (the image expected to be generated). The training section of the README should cover this. The reference image, that is, the image prompt, is encoded by a ViT and used for cross-attention; the VAE is not used for it. The blueprint is injected into the UNet through additional convolutional layers. The VAE input has not been replaced and is still the image expected to be generated.
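For anyone preparing their own data, here is a minimal sketch of how the three columns described above could be assembled from files on disk. The column names (`blueprint`, `image_prompt`, `image`), the one-directory-per-column layout, and the helper `build_rows` are assumptions made for illustration, not something the repository confirms; check the README for the names the training script actually expects.

```python
from pathlib import Path

# Hypothetical column names: one per data column mentioned in the reply above.
COLUMNS = ("blueprint", "image_prompt", "image")


def build_rows(root):
    """Pair files that share a stem across root/blueprint, root/image_prompt, root/image.

    Assumed layout (illustrative only): one sub-directory per column, where
    matching samples have the same file name stem, e.g. 0001.png.
    """
    root = Path(root)
    # Map file stem -> path, separately for each column directory.
    by_column = {
        col: {p.stem: p for p in sorted((root / col).glob("*")) if p.is_file()}
        for col in COLUMNS
    }
    # Keep only samples that are present in all three directories.
    shared = set.intersection(*(set(files) for files in by_column.values()))
    # One dict per sample, with a file path for each of the three columns.
    return [
        {col: str(by_column[col][stem]) for col in COLUMNS}
        for stem in sorted(shared)
    ]
```

If the `datasets` library is installed, the resulting rows can probably be turned into a dataset with `datasets.Dataset.from_list(rows)`, and each path column decoded as an image with `cast_column(name, datasets.Image())`. Whether the training script accepts that exact layout is not stated in this thread.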

aihao2000 avatar Jan 14 '24 15:01 aihao2000

Thank you for your outstanding contributions. Could you kindly provide your email address? I have several specific inquiries that require your insight.

yejy53 avatar Jan 15 '24 02:01 yejy53

@yejy53 Of course, [email protected]

aihao2000 avatar Jan 15 '24 02:01 aihao2000