ControlNet and ControlLLLite models are completely broken on SDXL
Tested with Pony Diffusion: I used both a ControlLLLite SDXL canny model and a ControlNet SDXL lineart model, and neither was able to generate. I am using the latest version of the repo.
Example log from a ControlNet lineart model:
```
2024-08-25 01:44:15,628 - ControlNet - INFO - Preview Resolution = 1024
2024-08-25 01:44:22,350 - ControlNet - INFO - Preview Resolution = 1024
2024-08-25 01:44:24,826 - ControlNet - INFO - ControlNet Input Mode: InputMode.SIMPLE
2024-08-25 01:44:25,727 - ControlNet - INFO - Using preprocessor: lineart_standard (from white bg & black line)
2024-08-25 01:44:25,727 - ControlNet - INFO - preprocessor resolution = 1024
2024-08-25 01:44:26,031 - ControlNet - INFO - Current ControlNet ControlLLLitePatcher: U:\SD\stable-diffusion-automatic1111-webui\models\ControlNet\controlnetxlCNXL_bdsqlszLineart.safetensors
INFO:sd_dynamic_prompts.dynamic_prompting:Prompt matrix will create 8 images in a total of 1 batches.
To load target model JointTextEncoder
Begin to load 1 model
[Unload] Trying to free 5650.87 MB for cuda:0 with 0 models keep loaded ...
[Unload] Current free memory is 3839.57 MB ...
[Unload] Unload model KModel
[Memory Management] Current Free GPU Memory: 9544.89 MB
[Memory Management] Required Model Memory: 1752.98 MB
[Memory Management] Required Inference Memory: 3372.00 MB
[Memory Management] Estimated Remaining GPU Memory: 4419.91 MB
Moving model(s) has taken 6.16 seconds
[Unload] Trying to free 3372.00 MB for cuda:0 with 1 models keep loaded ...
[Unload] Current free memory is 7785.09 MB ...
136 modules
2024-08-25 01:44:33,314 - ControlNet - INFO - ControlNet Method lineart_standard (from white bg & black line) patched.
To load target model KModel
Begin to load 1 model
[Unload] Trying to free 17148.88 MB for cuda:0 with 0 models keep loaded ...
[Unload] Current free memory is 7781.50 MB ...
[Unload] Unload model JointTextEncoder
[Memory Management] Current Free GPU Memory: 9540.69 MB
[Memory Management] Required Model Memory: 4897.05 MB
[Memory Management] Required Inference Memory: 3372.00 MB
[Memory Management] Estimated Remaining GPU Memory: 1271.64 MB
Moving model(s) has taken 6.05 seconds
  0%|          | 0/32 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "U:\SD\forge-webui\webui\modules_forge\main_thread.py", line 30, in work
    self.result = self.func(*self.args, **self.kwargs)
  File "U:\SD\forge-webui\webui\modules\txt2img.py", line 112, in txt2img_function
    processed = processing.process_images(p)
  File "U:\SD\forge-webui\webui\modules\processing.py", line 818, in process_images
    res = process_images_inner(p)
  File "U:\SD\forge-webui\webui\modules\processing.py", line 961, in process_images_inner
    samples_ddim = p.sample(conditioning=p.c, unconditional_conditioning=p.uc, seeds=p.seeds, subseeds=p.subseeds, subseed_strength=p.subseed_strength, prompts=p.prompts)
  File "U:\SD\forge-webui\webui\modules\processing.py", line 1332, in sample
    samples = self.sampler.sample(self, x, conditioning, unconditional_conditioning, image_conditioning=self.txt2img_image_conditioning(x))
  File "U:\SD\forge-webui\webui\modules\sd_samplers_kdiffusion.py", line 238, in sample
    samples = self.launch_sampling(steps, lambda: self.func(self.model_wrap_cfg, x, extra_args=self.sampler_extra_args, disable=False, callback=self.callback_state, **extra_params_kwargs))
  File "U:\SD\forge-webui\webui\modules\sd_samplers_common.py", line 272, in launch_sampling
    return func()
  File "U:\SD\forge-webui\webui\modules\sd_samplers_kdiffusion.py", line 238, in <lambda>
    samples = self.launch_sampling(steps, lambda: self.func(self.model_wrap_cfg, x, extra_args=self.sampler_extra_args, disable=False, callback=self.callback_state, **extra_params_kwargs))
  File "U:\SD\forge-webui\system\python\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "U:\SD\forge-webui\webui\k_diffusion\sampling.py", line 627, in sample_dpmpp_2m_sde
    denoised = model(x, sigmas[i] * s_in, **extra_args)
  File "U:\SD\forge-webui\system\python\lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "U:\SD\forge-webui\system\python\lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "U:\SD\forge-webui\webui\modules\sd_samplers_cfg_denoiser.py", line 196, in forward
    denoised, cond_pred, uncond_pred = sampling_function(self, denoiser_params=denoiser_params, cond_scale=cond_scale, cond_composition=cond_composition)
  File "U:\SD\forge-webui\webui\backend\sampling\sampling_function.py", line 362, in sampling_function
    denoised, cond_pred, uncond_pred = sampling_function_inner(model, x, timestep, uncond, cond, cond_scale, model_options, seed, return_full=True)
  File "U:\SD\forge-webui\webui\backend\sampling\sampling_function.py", line 303, in sampling_function_inner
    cond_pred, uncond_pred = calc_cond_uncond_batch(model, cond, uncond_, x, timestep, model_options)
  File "U:\SD\forge-webui\webui\backend\sampling\sampling_function.py", line 273, in calc_cond_uncond_batch
    output = model.apply_model(input_x, timestep_, **c).chunk(batch_chunks)
  File "U:\SD\forge-webui\webui\backend\modules\k_model.py", line 45, in apply_model
    model_output = self.diffusion_model(xc, t, context=context, control=control, transformer_options=transformer_options, **extra_conds).float()
  File "U:\SD\forge-webui\system\python\lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "U:\SD\forge-webui\system\python\lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "U:\SD\forge-webui\webui\backend\nn\unet.py", line 713, in forward
    h = module(h, emb, context, transformer_options)
  File "U:\SD\forge-webui\system\python\lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "U:\SD\forge-webui\system\python\lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "U:\SD\forge-webui\webui\backend\nn\unet.py", line 83, in forward
    x = layer(x, context, transformer_options)
  File "U:\SD\forge-webui\system\python\lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "U:\SD\forge-webui\system\python\lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "U:\SD\forge-webui\webui\backend\nn\unet.py", line 321, in forward
    x = block(x, context=context[i], transformer_options=transformer_options)
  File "U:\SD\forge-webui\system\python\lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "U:\SD\forge-webui\system\python\lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "U:\SD\forge-webui\webui\backend\nn\unet.py", line 181, in forward
    return checkpoint(self._forward, (x, context, transformer_options), None, self.checkpoint)
  File "U:\SD\forge-webui\webui\backend\nn\unet.py", line 12, in checkpoint
    return f(*args)
  File "U:\SD\forge-webui\webui\backend\nn\unet.py", line 216, in _forward
    n, context_attn1, value_attn1 = p(n, context_attn1, value_attn1, extra_options)
  File "U:\SD\forge-webui\webui\extensions-builtin\sd_forge_controlllite\lib_controllllite\lib_controllllite.py", line 99, in __call__
    q = q + self.modules[module_pfx_to_q](q)
  File "U:\SD\forge-webui\system\python\lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "U:\SD\forge-webui\system\python\lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "U:\SD\forge-webui\webui\extensions-builtin\sd_forge_controlllite\lib_controllllite\lib_controllllite.py", line 213, in forward
    cx = self.conditioning1(self.cond_image.to(x.device, dtype=x.dtype))
  File "U:\SD\forge-webui\system\python\lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "U:\SD\forge-webui\system\python\lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "U:\SD\forge-webui\system\python\lib\site-packages\torch\nn\modules\container.py", line 217, in forward
    input = module(input)
  File "U:\SD\forge-webui\system\python\lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "U:\SD\forge-webui\system\python\lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "U:\SD\forge-webui\system\python\lib\site-packages\torch\nn\modules\conv.py", line 460, in forward
    return self._conv_forward(input, self.weight, self.bias)
  File "U:\SD\forge-webui\system\python\lib\site-packages\torch\nn\modules\conv.py", line 456, in _conv_forward
    return F.conv2d(input, weight, bias, self.stride,
RuntimeError: Input type (struct c10::Half) and bias type (float) should be the same
Input type (struct c10::Half) and bias type (float) should be the same
```
controlnetxlCNXL_bdsqlszLineart.safetensors is working for me with a Pony realistic merge and with a Lightning model. I also tested canny, lineart, and depth with the ProMax ControlNet - all working correctly.
Bizarre. Do you have anything like --all-in-fp32 set, or xformers disabled? I tried the former and it allowed ControlNet to work (but extremely, extremely slowly), and I have heard xformers has caused issues with ControlNet in the past.
If it's not that, then I can only assume it has something to do with image resolution, generation settings, or ControlNet settings. I'll post mine in a bit.
I tried some different image resolutions up to 1024x1536 and different ControlNet resolutions (512, 648, and 1024), with DPM++ 2M SDE and then other samplers, in both batch and single-image mode. Fully updated. I'm not using any command line arguments and no xformers, so your idea there might be right.
Sorry, I meant to reply to this before, but last time I tried I had bad luck: I checked the repo right as it was locked down due to the malware bots earlier.
I'm still having the issue as of the latest version of the repo. I've tried to narrow down the possibilities as much as possible and I'm still confused.
Allow me to provide every possible variable for reproducibility. It happens with and without xformers, and with and without LoRAs.
I have a 2080 Ti with the latest GeForce drivers.
Both the generation resolution and the ControlNet image are now 1024x1024, and it still happens. I am using the base Pony Diffusion model, not a finetune of Pony, although I don't know if that would really cause the issue.
My Forge install is using CUDA 12.1 and PyTorch 2.3.1. I am currently testing with the ControlLLLite model controlnetxlCNXL_bdsqlszLineart [74db2627], though it happens with others too.
Generation Parameters:

```
Steps: 25, Sampler: Euler a, Schedule type: Automatic, CFG scale: 5, Seed: 1657669160, Size: 1024x1024, Model hash: 8cd86b11ad, Model: ponyDiffusionV6XL_v6StartWithThisOne, ControlNet 0: "Module: lineart_standard (from white bg & black line), Model: controlnetxlCNXL_bdsqlszLineart [74db2627], Weight: 1, Resize Mode: Crop and Resize, Processor Res: 512, Threshold A: 0.5, Threshold B: 0.5, Guidance Start: 0.0, Guidance End: 1.0, Pixel Perfect: True, Control Mode: Balanced, Hr Option: Both"
```
It also happens with or without Pixel Perfect. I'm not sure whether it could be a rounding issue caused by the resolution of the control images; I once saw a bug caused by images with odd-numbered resolutions. See the sketch below for the kind of rounding I mean.
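This is a hypothetical sketch, not the extension's actual pixel-perfect code: latent diffusion models work on dimensions that are multiples of 8, so an odd-numbered control-image dimension has to be snapped to a nearby multiple and can end up a few pixels off from the generation resolution:

```python
# Hypothetical illustration of the rounding concern, not the extension's
# actual pixel-perfect logic: SDXL latents are 1/8 the pixel size, so
# dimensions get snapped to the nearest multiple of 8.
def snap_to_multiple(value: int, base: int = 8) -> int:
    """Round a pixel dimension to the nearest multiple of `base`."""
    return max(base, round(value / base) * base)

for dim in (1024, 1023, 649):
    print(f"{dim} -> {snap_to_multiple(dim)}")
# 1024 -> 1024
# 1023 -> 1024
#  649 ->  648  (an odd-sized control image no longer matches a
#                1024x1024 generation exactly)
```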
Please give this a try with latest.
Tested with both controllllite_v01032064e_sdxl_canny_anime [8eef53e1] and controlnetxlCNXL_bdsqlszLineart [74db2627]. The latter seems to be low quality, but I am not sure whether that is an issue with the ControlNet implementation or just the ControlNet model itself. I seem to remember generations looking higher quality when I used A1111, but I may be misremembering.
Either way, controlnet generation seems to be working again! Thank you very much for your hard work!
That's interesting to hear. I did notice during testing that Pony-based models did not work very well with the bdsqlsz models, whereas models like ProtoVision worked much better. I'd be interested to know what model you were using, to investigate further.
https://civitai.com/models/136070?modelVersionId=267516
I believe I downloaded the lineart model from there on Civitai a while back. It's entirely possible the model itself is just flawed, as it's an old ControlNet model from 2023, yet I still swear I remember it running much better. I'm testing with base Pony, if that makes any difference.

It's difficult for me to test in detail because ControlNet also feels slower than it did when I tested on A1111. (Eventually I ran into an issue with PyTorch that keeps A1111 from running on my GPU, which is why I switched to Forge.) ControlNet seems to almost double the generation time, and a full batch size crashes with Union ProMax. I have 11 GB of VRAM, 32 GB of RAM, and a 150 GB page file on an NVMe SSD, so I am wondering why it uses so many resources. If the slowdown is real, it's likely a separate issue, but I don't have any metrics, so I can't say for certain whether I'm simply misremembering or whether it was because I was using SD 1.5 for ControlNet before.