sd-webui-controlnet

Controlling CN from within another script?

Open scruffynerf opened this issue 2 years ago • 12 comments

I have a working api based solution for this, but trying to rewrite it into a script to run from within a1111, and realizing that it's not obvious how to do so, so asking to be sure I'm not missing something.

My standalone script using api (shoutout to @mix1009 for sdwebuiapi) : given image pile, take each in turn and use as image with model, and given prompt in txt2img. It works, and I get back each new image, using CN, which was fed the image and the model desired, and using the prompt.

Trying to do the same inside A1111 using a built in script I'm writing, I have working scripts that do everything else I want (modify parameters of txt2img etc...), but the parameters of Controlnet are only passed via p.script_args, yes? So if I want to change the image used with a model, or even the model used, (assuming Controlnet is otherwise active and configured as desired), the 'correct' way is to modify p.script_args, change the values there, and generate an image [aka process_images(p)]?

I'll do that, I just wanted to be sure that a better way to do this doesn't exist (yet)?

scruffynerf avatar Mar 01 '23 21:03 scruffynerf

That's how the /controlnet/*2img API routes currently convert requests into controlnet settings. You can also use the p.control_net_image, p.control_net_module, etc. properties and the extension should work with these values.

To update the script_args related to controlnet script, you should be able to write something like this:

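def update_cn_script_args(p, new_args):
    # Untested sketch: swap in the new controlnet args, then shift later scripts.
    new_args = list(new_args)
    script_args = list(p.script_args)  # p.script_args may be a tuple
    offset = 0  # stays 0 for scripts that come before controlnet
    for script in p.scripts.alwayson_scripts:
        if script.title().lower() == 'controlnet':
            offset = len(new_args) - (script.args_to - script.args_from)
            script_args[script.args_from:script.args_to] = new_args
            script.args_to = script.args_from + len(new_args)
        else:
            script.args_from += offset
            script.args_to += offset
    p.script_args = script_args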

I did not run the code but the idea is the same: resize the args to include as many units as you want, then offset the args of the other scripts, in case there are other scripts after controlnet.
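
The resize-then-offset idea can be checked outside the webui with stand-in objects. This is a minimal sketch, assuming each always-on script exposes args_from and args_to and that the scripts are listed in argument order. resize_script_args and the stand-in scripts are hypothetical names for illustration, not part of the webui or the extension (in the webui, title is a method rather than an attribute):

```python
from types import SimpleNamespace


def resize_script_args(script_args, scripts, target_title, new_args):
    """Replace one script's slice of script_args and shift the scripts after it.

    Hypothetical helper: `scripts` stands in for p.scripts.alwayson_scripts,
    assumed to be ordered by args_from.
    """
    script_args = list(script_args)
    new_args = list(new_args)
    offset = 0  # 0 for scripts that come before the target
    for script in scripts:
        if script.title.lower() == target_title:
            offset = len(new_args) - (script.args_to - script.args_from)
            script_args[script.args_from:script.args_to] = new_args
            script.args_to = script.args_from + len(new_args)
        else:
            script.args_from += offset
            script.args_to += offset
    return script_args


# Three stand-in scripts sharing one flat argument list.
scripts = [
    SimpleNamespace(title="first", args_from=0, args_to=2),
    SimpleNamespace(title="ControlNet", args_from=2, args_to=4),
    SimpleNamespace(title="last", args_from=4, args_to=6),
]
args = ["a0", "a1", "cn0", "cn1", "z0", "z1"]

# Grow the ControlNet slice from 2 to 4 values.
args = resize_script_args(args, scripts, "controlnet", ["n0", "n1", "n2", "n3"])
last = scripts[2]
print(args[last.args_from:last.args_to])  # the last script still sees its own args
```

The script after the resized one keeps reading its own values because its range moves by the same amount the ControlNet slice grew.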

I think the code that lets other extensions hook into this extension should be updated to accept multiple control units/layers; as far as I am aware, only the first one is currently usable that way.

ljleb avatar Mar 01 '23 23:03 ljleb

Slightly confused by the above...

You can use the p.controlnet_input_image, p.controlnet_mask, etc. properties and the extension should work with these values.

Wait, I can just set those, and they'll be used? OR....

To update the script_args related to controlnet script, you should be able to write something like this:

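def update_cn_script_args(p, new_args):
    # Untested sketch: swap in the new controlnet args, then shift later scripts.
    new_args = list(new_args)
    script_args = list(p.script_args)  # p.script_args may be a tuple
    offset = 0  # stays 0 for scripts that come before controlnet
    for script in p.scripts.alwayson_scripts:
        if script.title().lower() == 'controlnet':
            offset = len(new_args) - (script.args_to - script.args_from)
            script_args[script.args_from:script.args_to] = new_args
            script.args_to = script.args_from + len(new_args)
        else:
            script.args_from += offset
            script.args_to += offset
    p.script_args = script_args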

yeah, I was literally coding up something like this when you replied... so if I found the model listed (ie arg idx 4 I think), I could change that as desired.

so which is the 'correct' way? just set p.control_net_model and let CN adjust itself as part of processing, or do I have to go into the script_args and do it that way?

scruffynerf avatar Mar 01 '23 23:03 scruffynerf

The code that handles other extensions feeding data into p instead of using the extension args is here: https://github.com/Mikubill/sd-webui-controlnet/blob/e67ee2730457867f78cdbffd6361626abfed30e8/scripts/controlnet.py#L544-L546

So my assumption is that you can either set properties on p that are listed here: https://github.com/Mikubill/sd-webui-controlnet/blob/e67ee2730457867f78cdbffd6361626abfed30e8/scripts/controlnet.py#L504-L519

Or hack the arguments the webui passes to the controlnet script, like I did above. (and like the api routes are handled)

With the p.control_net_* properties, it doesn't seem like you can use multiple controlnet unit argument groups. I think I'd go for the args hacking if you want to use multiple control units/layers. There doesn't seem to be a preferred or official way. The p.control_net_* args used to be the right way, but supporting multiple ways of calling the script process and postprocess methods adds unnecessary complexity to the code IMO. I think we should consider adding the above function to the controlnet extension so that people don't have to code it again and again.

We could create a ControlUnitArgs or ControlLayerArgs class with default values for convenience from external code, and then pass a list of these instead to the above function as well. Then the function unpacks the args and formats them nicely into p.script_args so that the extension doesn't see any difference from ui values. If we need to know whether the ui called or not, add an additional param is_ui or something after is_img2img.
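
A minimal sketch of what such a class could look like. Everything here is hypothetical: the name ControlUnitArgs comes from the proposal above, while flatten_units, the field names, their order, their defaults, and the placeholder model strings are assumptions chosen for illustration, not the extension's actual argument layout:

```python
from dataclasses import dataclass, fields
from typing import Any, List


@dataclass
class ControlUnitArgs:
    # Hypothetical fields and defaults; the real extension's argument
    # order and defaults may differ.
    enabled: bool = True
    module: str = "none"
    model: str = "None"
    weight: float = 1.0
    image: Any = None
    guidance_start: float = 0.0
    guidance_end: float = 1.0


def flatten_units(units: List[ControlUnitArgs]) -> List[Any]:
    """Unpack a list of units into one flat list, one group of values per unit."""
    flat: List[Any] = []
    for unit in units:
        # Read fields in declaration order without copying the values.
        flat.extend(getattr(unit, f.name) for f in fields(unit))
    return flat


# Two units; anything not passed falls back to the class defaults.
units = [
    ControlUnitArgs(model="model-a"),
    ControlUnitArgs(module="depth", model="model-b", weight=0.5),
]
flat = flatten_units(units)
print(len(flat))  # 7 fields per unit, 2 units
```

External code would only ever name the fields it cares about, so the flat ordering stays an implementation detail of the extension.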

Edit: I noticed I wasn't using the right p.control_net_* property names. I updated this, among other details in my comments, to avoid confusion.

ljleb avatar Mar 02 '23 00:03 ljleb

thanks for the help, @ljleb , just to document this for the next person who searches:

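import numpy as np
from PIL import Image
from modules.processing import process_images  # a1111 webui module

with Image.open(image_path) as img_file:
    img_array = np.asarray(img_file)
# ControlNet input: the image plus an all-zero (empty) mask of the same shape
img = {}
img['image'] = img_array
img['mask'] = np.zeros_like(img_array, dtype=np.uint8)
p.control_net_image = img
proc = process_images(p)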

(In the above working example, I'm manually setting 'enable' and the model in the CN ui and then hitting generate, with my script active as well, so the code doesn't need to set them. They would probably be required if it's all in code.)

scruffynerf avatar Mar 02 '23 01:03 scruffynerf

actually I'll reopen this. I think we should close this when the api for external code is clarified / updated.

ljleb avatar Mar 02 '23 02:03 ljleb

@aiton-sd Sorry to bother you. As you wrote some code recently related to external code access, I'd love to hear your opinion on this issue.

How do you think external code should interact with the extension: do we use the extra properties on p: StableDiffusionProcessing instances or adjust the args_from and args_to of ScriptRunner instances on p?

Maybe the /controlnet/*2img API routes and external code could use the same mechanism to interact with the extension or share a bit of code. Is that a bad idea in your opinion?

ljleb avatar Mar 02 '23 17:03 ljleb

Btw, my 2 cents is that having to use the arg method (figuring out where in the pile of all settings, ordered but unlabeled, the changes are needed) is far far more awkward. Right now, so many other elements to mess with are explicit in p.*, and allowing that to work well/better seems more likely to encourage usage in the future.

I'm glad I asked, because I was going down the harder road, and learning that the easy path existed meant my code was cleaner and simpler. (And thus more likely to inspire/empower others to take more steps of development)

scruffynerf avatar Mar 02 '23 18:03 scruffynerf

Btw, my 2 cents is that having to use the arg method (figuring out where in the pile of all settings, ordered but unlabeled, the changes are needed) is far far more awkward. Right now, so many other elements to mess with are explicit in p.*, and allowing that to work well/better seems more likely to encourage usage in the future.

I agree that updating the args is messy. If we write and expose a function with a clear signature for others, then it should be a lot less error prone:

We could create a ControlUnitArgs or ControlLayerArgs class with default values for convenience from external code, and then pass a list of these instead to the above function as well. Then the function unpacks the args and formats them nicely into p.script_args so that the extension doesn't see any difference from ui values. If we need to know whether the ui called or not, add an additional param is_ui or something after is_img2img.

Advantages are: we can update the implementation as much as we want without breaking external code, as the controlnet extension holds the implementation details. Also, external code inherits default values that we also control for convenience. If we need to change anything, as long as it's not the external code interface, we can and it should work seamlessly. There would be no need to support 2 distinct ways of receiving values anymore in process and postprocess, which means less code to maintain in these functions.

Disadvantages I am aware of for updating script_args ourselves: we are responsible for maintaining code that is depending on the structure of the webui host. If something changes or breaks in this code, more maintenance.

The latter can be worked around if we use p props + a functional external code API I think. If using functional API, we could change the implementation as we need if it's too unstable to work with p.script_args hacking.

In my opinion, defining the API with a function will result in more stability. Maybe in both cases, whether p.args or p.control_net_* props, the exposed external code API for interacting with controlnet should be a function like this.

ljleb avatar Mar 02 '23 18:03 ljleb

With the p.control_net_* properties, it doesn't seem like you can use multiple controlnet unit argument groups. I think I'd go for the args hacking if you want to use multiple control units/layers.

Confirmed. If consensus is that p.control_net_* is the way to go, then likely p.control_net_1_*, p.control_net_2_*, etc. (with p.control_net_0_* being considered the same as the unnumbered p.control_net_*, if only 1 CN exists) would be ok, if awkward.

so consider this a vote for a single hierarchy that just handles it all smoothly. even just a p.control_net stacked list/dict would be fine, where you could just add units/layers as desired (and it would just error if more than the # that current settings allowed)
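
A minimal sketch of the list idea, using a stand-in object instead of a real StableDiffusionProcessing instance. The attribute name control_net_units and the function read_control_units are hypothetical, chosen only to illustrate the proposal; the extension does not read such a property as far as this thread shows:

```python
from types import SimpleNamespace


def read_control_units(p, max_units):
    """Return the units attached to p, or raise if there are too many.

    Hypothetical: `p.control_net_units` is an illustrative attribute name.
    """
    units = list(getattr(p, "control_net_units", []))
    if len(units) > max_units:
        raise ValueError(
            f"{len(units)} units requested, but settings allow {max_units}"
        )
    return units


# Stand-in for a StableDiffusionProcessing instance.
p = SimpleNamespace()
p.control_net_units = [
    {"model": "model-a", "weight": 1.0},
    {"model": "model-b", "weight": 0.5},
]
units = read_control_units(p, max_units=2)
print(len(units))
```

Adding a third unit with max_units=2 would raise ValueError, which matches the "just error if more than the settings allow" behaviour described above.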

As soon as I wanted to try a dual CN using script, now that I have the single case working, the p.script_args approach is so much more cumbersome, but the only way right now that works with multiples, so yes, thanks for leaving this issue open, and hopefully a best practice will arise that solves this going forward.

scruffynerf avatar Mar 02 '23 21:03 scruffynerf

I don't think we want to start counting with the prop names 😅 but yeah I think a list property just like we did for the rest api would work okay as well. These implementation details wouldn't matter if we had a functional external code api I think. We could just add a function in this case if updating existing functions could break existing external code.

ljleb avatar Mar 02 '23 22:03 ljleb

Again, if/until something better comes along... here's my working example of how to do multiCN using p.script_args. Code improvements welcomed... I admit to python not being my best language, and defer to others with more skillz. Feel free to point out a better way/method, or to crib code for use as needed.

Assumes p is already global, such as inside an a1111 script's run() function where p is passed in as an arg, and all of the below lives inside that as subfunctions.

        def get_cn_script_args():
            for script in p.scripts.alwayson_scripts:
                if "controlnet.py" in script.filename:
                    return p.script_args[script.args_from+1:script.args_to]
            return []

        def update_cn_script_args(new_args):
            for script in p.scripts.alwayson_scripts:
                if "controlnet.py" in script.filename:
                    if len(p.script_args[script.args_from+1:script.args_to]) == len(new_args):
                        p.script_args = list(p.script_args[:script.args_from+1]) + list(new_args) + list(p.script_args[script.args_to:])
                        return True
            return False
          
        def set_cn_layer(model, image, layer=0, module='none', resize_mode="Scale to Fit (Inner Fit)",
                         weight=1.0, pres=64, pthr_a=64, pthr_b=64, guidance_start=0, guidance_end=1,
                         scribble_mode=False, rgbbgr_mode=False, lowvram=False, guess_mode=False, enabled=True):
            cn_params = list(get_cn_script_args())
            if 15*(layer+1) <= len(cn_params):
                idx = 15*layer
                cn_params[idx] = enabled
                cn_params[idx+1] = module
                cn_params[idx+2] = model
                cn_params[idx+3] = weight
                cn_params[idx+4] = image
                cn_params[idx+5] = scribble_mode
                cn_params[idx+6] = resize_mode
                cn_params[idx+7] = rgbbgr_mode
                cn_params[idx+8] = lowvram
                cn_params[idx+9] = pres
                cn_params[idx+10] = pthr_a
                cn_params[idx+11] = pthr_b
                cn_params[idx+12] = guidance_start
                cn_params[idx+13] = guidance_end
                cn_params[idx+14] = guess_mode
                return update_cn_script_args(cn_params)
            return False

        # 15 params for CN          
        cn_param_names = ["enabled", "module", "model", "weight", "image", 
                        "scribble_mode", "resize_mode", "rgbbgr_mode", "lowvram", "pres", 
                        "pthr_a", "pthr_b", "guidance_start", "guidance_end", "guess_mode"]

        def print_cn_params():
            for i, value in enumerate(get_cn_script_args()):
                print(f"ControlNet-{i//15} {cn_param_names[i%15]} = {value}")

        def set_cn_arg(value, name, layer=0):
            cn_params = list(get_cn_script_args())
            if 15*(layer+1) <= len(cn_params) and name in cn_param_names:
                idx = int(cn_param_names.index(name))+(15*layer)
                cn_params[idx] = value
                return update_cn_script_args(cn_params)
            return False

        def read_cn_arg(name, layer=0):
            cn_params = list(get_cn_script_args())
            if 15*(layer+1) <= len(cn_params) and name in cn_param_names:
                return cn_params[cn_param_names.index(name)+15*layer]
            return False
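
The helpers above all rely on the same indexing: unit number `layer` occupies positions 15*layer through 15*layer+14 of the flat list. Here is a minimal sketch of that arithmetic on its own, runnable outside the webui. group_cn_args is a hypothetical name, and the parameter list is copied from the code above, so it reflects the extension as of this thread and may have changed since:

```python
CN_PARAM_NAMES = [
    "enabled", "module", "model", "weight", "image",
    "scribble_mode", "resize_mode", "rgbbgr_mode", "lowvram", "pres",
    "pthr_a", "pthr_b", "guidance_start", "guidance_end", "guess_mode",
]


def group_cn_args(flat_args, names=CN_PARAM_NAMES):
    """Split a flat argument list into one {name: value} dict per unit."""
    stride = len(names)
    if len(flat_args) % stride != 0:
        raise ValueError("argument count is not a multiple of the unit size")
    return [
        dict(zip(names, flat_args[start:start + stride]))
        for start in range(0, len(flat_args), stride)
    ]


# Two units' worth of placeholder values: 0..14 and 15..29.
units = group_cn_args(list(range(30)))
print(units[1]["model"])  # flat index 15*1 + 2 = 17
```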

scruffynerf avatar Mar 03 '23 14:03 scruffynerf

I'm working on a PR for this, I'll open it when I get the controlnet routes in the web api to work with it as a proof of concept.

ljleb avatar Mar 03 '23 15:03 ljleb

I found this while on a quest to retrieve the base64 encoded image from a CN unit from inside another script. Is this possible? I've been playing around with pulling code from ControlNet into my script, or even calling static methods from my own script, but no luck so far. Any ideas?

Thanks!

marcsyp avatar Nov 03 '23 02:11 marcsyp