How to do promptless training for ControlNet? Is there any script for that?

Open KhawlahB opened this issue 1 year ago • 14 comments

I need to do promptless training. How can I do it?

Is there a script for training a promptless ControlNet directly?

KhawlahB avatar Dec 09 '23 00:12 KhawlahB

Just set the captions to an empty string.

These are all duplicates about "dropping prompts": https://github.com/lllyasviel/ControlNet/issues/93 https://github.com/lllyasviel/ControlNet/issues/160 https://github.com/lllyasviel/ControlNet/issues/246 https://github.com/lllyasviel/ControlNet/issues/422 https://github.com/lllyasviel/ControlNet/issues/506

geroldmeisinger avatar Dec 09 '23 07:12 geroldmeisinger

I already had this idea in mind (using an empty string), but the results do not look good. That's why I am asking whether there is a script for promptless image-to-image generation with ControlNet, so I can do promptless training directly.

@geroldmeisinger

KhawlahB avatar Dec 10 '23 00:12 KhawlahB

You don't want to drop all prompts, just a certain percentage, otherwise the CN becomes meaningless. Something like rand() > 0.5 ? "" : caption. But to give a better answer we would need more details. What is the concept of your CN? How does it perform without prompt dropping?
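The one-liner above could be sketched in Python roughly as follows. This is a minimal sketch, not code from the ControlNet repository: the helper name `maybe_drop_caption` and the `drop_prob` parameter are made up for illustration, and where you call it (typically wherever your dataset class returns the prompt for a sample) depends on your training script.

```python
import random


def maybe_drop_caption(caption: str, drop_prob: float = 0.5, rng=None) -> str:
    """Return "" with probability drop_prob, otherwise the caption unchanged.

    Hypothetical helper for illustration; not part of ControlNet.
    """
    if not 0.0 <= drop_prob <= 1.0:
        raise ValueError("drop_prob must be in [0, 1]")
    rng = rng or random
    # rng.random() is uniform in [0.0, 1.0), so drop_prob=0.0 never drops
    # and drop_prob=1.0 always drops.
    return "" if rng.random() < drop_prob else caption


if __name__ == "__main__":
    rng = random.Random(0)  # seeded only to make the demo repeatable
    captions = ["an aerial photograph"] * 10
    print([maybe_drop_caption(c, 0.5, rng) for c in captions])
```

Calling the helper each time a sample is loaded, rather than once up front, means a different subset of captions is dropped in every epoch. Setting `drop_prob=1.0` reproduces the "all captions empty" case discussed in this thread.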

geroldmeisinger avatar Dec 10 '23 08:12 geroldmeisinger

The concept for my ControlNet is image-to-image translation: I want to feed the model an image from domain A and have it generate the corresponding image in domain B.

So I do not need the prompt. I have tried using an empty string and meaningless words, but the generated images had random textures; it mixed things up.

Can you please clarify how to drop the prompts? @geroldmeisinger

KhawlahB avatar Dec 11 '23 03:12 KhawlahB

Can you post some images, please? The problem is unlikely to be due to the prompts. Prompt dropping can subtly increase the quality, but it cannot fix a broken concept.

geroldmeisinger avatar Dec 11 '23 10:12 geroldmeisinger

It is similar to this:

Screenshot 2023-12-13 011044

It is image-to-image translation, so there is no need for a prompt. The input and output are images, and the condition will also be an image that controls the generation.

Is prompt dropping going to help with my idea? @geroldmeisinger

KhawlahB avatar Dec 12 '23 22:12 KhawlahB

  1. If you want to train multiple concepts in your CN, you need a lot more data (your image shows segments, greyscale2color, lines to image, etc., all in one).
  2. Imagine a line drawing of a circle. What is it supposed to generate from that without a prompt? A ball, an orange, a planet, a tire, etc. Unless you stay in the same domain (like images of street scenes), you have to guide it somehow. But for this you need to provide more details.

geroldmeisinger avatar Dec 12 '23 22:12 geroldmeisinger

  1. It is one concept only. I showed you different concepts just to clarify my point (I do not need a prompt). To be more specific, my idea is similar to this one: aerial to map. In training, the input will be the aerial image and its corresponding map. In inference, the input will be the aerial image, and the generated image should be a map of that aerial image.
Screenshot 2023-12-13 011044

So the guidance will be the conditioning image.

  2. What will happen if I drop 100% of the captions?

I hope it is clear now.

@geroldmeisinger

KhawlahB avatar Dec 12 '23 22:12 KhawlahB

What will happen if I drop 100% of the captions?

I don't know, I never tried it myself. It does feel strange, though, not to use any prompt at all on a diffusion model which requires prompts. You could also try using the same prompt for all samples, e.g. "an aerial photograph". Maybe it's better to create your own promptless diffusion model just for aerial photographs. Another thing you could look into is sd_unlocked=true (as far as I know, in the training script this corresponds to setting sd_locked = False).
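The "same prompt for all" idea can be tried without touching the training code by rewriting the caption file. A rough sketch, assuming the captions live in a JSON-lines file where each record has a "prompt" field next to the image paths (adjust the key if yours differs); the helper name `set_constant_prompt` is hypothetical:

```python
import json


def set_constant_prompt(json_lines, prompt="an aerial photograph", key="prompt"):
    """Return new JSON lines in which every record carries the same prompt.

    Hypothetical helper; the record layout is an assumption, not a spec.
    """
    out = []
    for line in json_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        record[key] = prompt  # overwrite (or add) the caption field
        out.append(json.dumps(record))
    return out


if __name__ == "__main__":
    sample = [
        '{"source": "source/0.png", "target": "target/0.png", "prompt": "old caption"}',
        '{"source": "source/1.png", "target": "target/1.png", "prompt": ""}',
    ]
    for line in set_constant_prompt(sample):
        print(line)
```

Passing `prompt=""` gives the fully promptless variant, so both options can be compared on the same data.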

geroldmeisinger avatar Dec 13 '23 08:12 geroldmeisinger

I will give it a try. But what does sd_unlocked=true mean? How is it going to help me with promptless training? Please clarify.

@geroldmeisinger

KhawlahB avatar Dec 14 '23 00:12 KhawlahB

As far as I understand, it trains "deeper" into Stable Diffusion, which could be helpful here if you're working in the same domain.
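A toy illustration of what "locked" versus "unlocked" could mean in practice. This is only my understanding of the option, so check the training script and model code before relying on it. The function and the group names below are illustrative labels, not real attribute paths. The assumption is that with SD locked only the ControlNet branch is optimized, while unlocking also hands some of Stable Diffusion's own layers to the optimizer.

```python
def trainable_groups(sd_locked: bool = True):
    """Return the (illustrative) parameter groups that would be optimized."""
    groups = ["control_model"]  # the ControlNet branch is always trained
    if not sd_locked:
        # Assumption: unlocking adds decoder-side layers of the SD UNet.
        groups += ["sd_unet_output_blocks", "sd_unet_out"]
    return groups


if __name__ == "__main__":
    print("locked:  ", trainable_groups(sd_locked=True))
    print("unlocked:", trainable_groups(sd_locked=False))
```

If this is right, unlocking lets the base model itself adapt to a narrow target domain such as maps, at the cost of drifting away from the original Stable Diffusion weights.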

geroldmeisinger avatar Dec 14 '23 10:12 geroldmeisinger

@KhawlahB I am trying similar things. How are your results? Really looking forward to your response :)

remember00000 avatar Apr 05 '24 05:04 remember00000

When attention is computed in the UNet, the prompt embedding provides the keys and values, so you can imagine what happens if the keys and values are always the same.
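The point about keys and values can be made concrete with a tiny single-head attention written on plain lists. This is a simplified sketch, not the UNet's actual implementation. It omits the learned projections and multiple heads, and all numbers are made up. It shows the extreme case of a one-token context, where the output is the same value vector for every query, so the text path tells the model nothing about the individual sample.

```python
import math


def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention: queries n_q x d, keys n_k x d, values n_k x d_v."""
    d = len(keys[0])
    out = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out


if __name__ == "__main__":
    # Two very different image-side queries, one constant "prompt" token.
    queries = [[1.0, 2.0], [-3.0, 0.5]]
    keys, values = [[0.3, 0.7]], [[5.0, 6.0]]
    print(cross_attention(queries, keys, values))
```

With a real text encoder even an empty prompt yields several tokens, so the queries still influence how the values are mixed, but the set of values being mixed never changes from sample to sample.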

CuddleSabe avatar May 11 '24 02:05 CuddleSabe