
Colored Scribble input

Open lazy-nurd opened this issue 2 years ago • 5 comments

Hey, thanks for your nice work. Scribble works really well for generation. But there are many scenarios where the input image contains both colors and strokes, and we want SD to process both. For example, this image: [image] It has a scribble as well as color, and the model should take both the color and the sketch as input; but the current implementation only considers the sketch, so the input color is ignored. [image] An example is above.

If you could train a model that takes a colored scribble and produces results according to it, that would be perfect.

Thanks

lazy-nurd avatar Feb 15 '23 12:02 lazy-nurd

As a tip, img2img with ControlNet gives you more or less what you want: get the scribbles from the boundaries, and the colors from the img2img input. At a denoising strength of around 0.8, the color hints survive. In regular img2img the structure would be lost at that strength, but ControlNet preserves it.

DiceOwl avatar Feb 15 '23 15:02 DiceOwl
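The "get the scribbles from the boundaries" half of that workflow can be sketched with a crude gradient-based edge detector. A real pipeline would use the repo's scribble preprocessing or a proper detector (HED, Canny); everything below, including the function name and threshold, is an illustrative stand-in:

```python
import numpy as np

def scribble_from_boundaries(img, threshold=0.1):
    """Crude scribble map: mark pixels where the luminance gradient is large.

    `img` is an H x W x 3 float array in [0, 1]. This is only a toy stand-in
    for a real edge detector; the threshold is arbitrary.
    """
    gray = img @ np.array([0.299, 0.587, 0.114])   # luminance
    gy, gx = np.gradient(gray)                      # finite-difference gradients
    magnitude = np.sqrt(gx**2 + gy**2)
    return (magnitude > threshold).astype(np.float32)  # 1.0 = scribble stroke

# Example: a dark square on a light background yields strokes at its border.
img = np.full((32, 32, 3), 0.9)
img[8:24, 8:24] = 0.1
scribble = scribble_from_boundaries(img)
```

The binary map would then be fed to the scribble ControlNet, while the original colored image goes in as the img2img init.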

@DiceOwl how would you go about using img2img with control net? At the moment, the base model that they use is text2img

CesarERamosMedina avatar Feb 16 '23 02:02 CesarERamosMedina

https://github.com/Mikubill/sd-webui-controlnet, an extension for Automatic1111, directly allows for img2img. More generally, stable diffusion is intrinsically img2img. text2img just replaces the input image by pure noise. So if you are the hacking kind, you could probably hack rudimentary img2img support into this repo with just a handful of lines of code. You just have to replace the pure noise with the right kind of mixture between noise and (diffused) input image, and adjust the denoising parameters to not start from pure noise.

DiceOwl avatar Feb 16 '23 09:02 DiceOwl
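The noise-mixing hack described above can be sketched as follows, assuming a DDPM-style schedule. The function name, signature, and schedule here are illustrative, not this repo's API:

```python
import numpy as np

def img2img_init(latent, strength, alphas_cumprod, seed=0):
    """Build the starting latent for img2img instead of pure noise.

    `latent` is the (VAE-encoded) input image and `alphas_cumprod` the
    schedule's cumulative alpha products (length = training timesteps).
    `strength` in (0, 1]: 1.0 reduces to text2img (pure noise); lower
    values keep more of the input image.
    """
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal(latent.shape)
    # Pick the timestep corresponding to the requested strength:
    t_start = min(int(strength * len(alphas_cumprod)), len(alphas_cumprod) - 1)
    a_bar = alphas_cumprod[t_start]
    # DDPM forward process: z_t = sqrt(a_bar) * z0 + sqrt(1 - a_bar) * eps
    z_t = np.sqrt(a_bar) * latent + np.sqrt(1.0 - a_bar) * noise
    # The sampler must then denoise from t_start down to 0, not from T.
    return z_t, t_start

# Toy schedule (linear betas, 1000 steps) and a dummy latent:
alphas_cumprod = np.cumprod(1.0 - np.linspace(1e-4, 0.02, 1000))
z0 = np.zeros((4, 8, 8))
z_t, t_start = img2img_init(z0, strength=0.8, alphas_cumprod=alphas_cumprod)
```

The key point is the last comment: the "adjust the denoising parameters" step means the sampler's loop starts at `t_start`, not at the final training timestep.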

I'm also interested in this question. Have you solved it? Looking forward to your reply! @lazy-nurd

leoShen917 avatar Mar 13 '23 08:03 leoShen917

> https://github.com/Mikubill/sd-webui-controlnet, an extension for Automatic1111, directly allows for img2img. More generally, stable diffusion is intrinsically img2img. text2img just replaces the input image by pure noise. So if you are the hacking kind, you could probably hack rudimentary img2img support into this repo with just a handful of lines of code. You just have to replace the pure noise with the right kind of mixture between noise and (diffused) input image, and adjust the denoising parameters to not start from pure noise.

When I try to do this during denoising, the resulting image tends to be blurrier. Do you know why, and how to solve it?

leoShen917 avatar Mar 13 '23 13:03 leoShen917
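One plausible cause of such blur, assuming a hand-rolled noise mixture like the one DiceOwl describes, is that the amount of noise added to the latent disagrees with the timestep the sampler starts denoising from: the model then treats real image detail as noise and smooths it away (in Stable Diffusion, forgetting the VAE latent scaling factor has a similar washed-out effect). The helper below is an illustrative consistency check, not part of this repo:

```python
import numpy as np

def residual_noise_variance(z_t, z0, alphas_cumprod, t):
    """Check that a mixed latent matches its claimed start timestep.

    For a correctly noised latent,
        z_t = sqrt(a_bar_t) * z0 + sqrt(1 - a_bar_t) * eps,
    so the residual z_t - sqrt(a_bar_t) * z0 should have variance close to
    (1 - a_bar_t). A large mismatch means the added noise and the sampler's
    start timestep disagree.
    """
    a_bar = alphas_cumprod[t]
    residual = z_t - np.sqrt(a_bar) * z0
    return residual.var(), 1.0 - a_bar

# Toy example: noise a latent for t = 300, then check t = 300 vs. t = 100.
alphas_cumprod = np.cumprod(1.0 - np.linspace(1e-4, 0.02, 1000))
rng = np.random.default_rng(0)
z0 = rng.standard_normal((4, 64, 64))
t = 300
z_t = (np.sqrt(alphas_cumprod[t]) * z0
       + np.sqrt(1.0 - alphas_cumprod[t]) * rng.standard_normal(z0.shape))

measured, expected = residual_noise_variance(z_t, z0, alphas_cumprod, t)
measured_bad, expected_bad = residual_noise_variance(z_t, z0, alphas_cumprod, 100)
```

If the measured and expected variances diverge at your chosen start timestep, the noising step and the sampler's schedule are out of sync.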