
ClipVision, StyleModel - any example?

DaveScream opened this issue 2 years ago · 13 comments

Any example of how it works?

DaveScream avatar Mar 13 '23 23:03 DaveScream

Open this PNG file in ComfyUI, put the style t2i adapter in models/style_models and the clip vision model (https://huggingface.co/openai/clip-vit-large-patch14/blob/main/pytorch_model.bin) in models/clip_vision. (Attached workflow image: ComfyUI_38460_)

comfyanonymous avatar Mar 20 '23 20:03 comfyanonymous
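
A quick way to double-check the setup described above is to confirm the two files actually landed in the folders ComfyUI scans. The install path and adapter filename in this sketch are assumptions; adjust them to your own layout.

```python
# Minimal sketch: verify the style adapter and CLIP vision files are in the
# folders mentioned above. Paths and filenames are assumptions -- edit as needed.
import os

comfy_root = r"E:\ComfyUI_windows_portable\ComfyUI"  # adjust to your install
expected = [
    os.path.join(comfy_root, "models", "style_models", "t2iadapter_style_sd14v1.pth"),
    os.path.join(comfy_root, "models", "clip_vision", "pytorch_model.bin"),
]
for path in expected:
    print(("OK     " if os.path.isfile(path) else "MISSING"), path)
```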

Is there a way to control the weight of the style model in the generation?

fraxcom avatar May 05 '23 13:05 fraxcom

Open this PNG file in ComfyUI, put the style t2i adapter in models/style_models and the clip vision model (https://huggingface.co/openai/clip-vit-large-patch14/blob/main/pytorch_model.bin) in models/clip_vision. (Attached workflow image: ComfyUI_38460_)

I tried this example, but ComfyUI only throws an exception.

I used a standalone distribution downloaded a while ago; it didn't work. I did a git pull for the latest code; that also didn't work. Clean base distribution, no custom nodes.

Error occurred when executing StyleModelApply:

Sizes of tensors must match except in dimension 1. Expected size 1280 but got size 1024 for tensor number 1 in the list.

File "E:\new_ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\ComfyUI\execution.py", line 153, in recursive_execute
output_data, output_ui = get_output_data(obj, input_data_all)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\new_ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\ComfyUI\execution.py", line 83, in get_output_data
return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\new_ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\ComfyUI\execution.py", line 76, in map_node_over_list
results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\new_ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\ComfyUI\nodes.py", line 867, in apply_stylemodel
cond = style_model.get_cond(clip_vision_output).flatten(start_dim=0, end_dim=1).unsqueeze(dim=0)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\new_ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\ComfyUI\comfy\sd.py", line 281, in get_cond
return self.model(input.last_hidden_state)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\new_ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\new_ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\new_ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\ComfyUI\comfy\t2i_adapter\adapter.py", line 214, in forward
x = torch.cat([x, style_embedding], dim=1)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

justanothernguyen avatar Dec 08 '23 15:12 justanothernguyen
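
For anyone reading the traceback above: the failure is in the style adapter's forward pass, where the CLIP Vision hidden states are concatenated with the adapter's learned style tokens along the token dimension, and torch.cat refuses when the channel widths disagree. The dummy shapes below are illustrative assumptions chosen only to reproduce the same message, not ComfyUI's real values.

```python
# Illustration with dummy tensors (shapes are assumptions): torch.cat along the
# token dimension (dim=1) requires every other dimension to match, so hidden
# states and style tokens with different channel widths cannot be joined.
import torch

clip_hidden = torch.randn(1, 257, 1280)   # e.g. output of a 1280-wide CLIP Vision model
style_tokens = torch.randn(1, 8, 1024)    # e.g. style tokens built for a 1024-wide model

try:
    torch.cat([clip_hidden, style_tokens], dim=1)
except RuntimeError as err:
    # Sizes of tensors must match except in dimension 1. Expected size 1280 but
    # got size 1024 for tensor number 1 in the list.
    print(err)
```

In other words, the style model and the clip_vision model have to agree on width, which is why the later replies focus on which CLIP vision file is loaded.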

I tried this example, but ComfyUI only throws an exception. [...]

Error occurred when executing StyleModelApply:

Sizes of tensors must match except in dimension 1. Expected size 1280 but got size 1024 for tensor number 1 in the list.

I am also getting this error. Did you find a solution that works?

traugdor avatar Jan 04 '24 16:01 traugdor

Check whether the checkpoint is compatible with the StyleModel/ClipVision model. It is not possible to mix an SD1.5 StyleModel/ClipVision model with an SDXL checkpoint.

ltdrdata avatar Jan 05 '24 00:01 ltdrdata
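
If it is unclear which family a checkpoint belongs to, one rough heuristic is to look at its state-dict key prefixes. The prefixes below are assumptions based on common Stable Diffusion checkpoint layouts, and the path is hypothetical, so treat the sketch as a guess rather than a definitive check.

```python
# Heuristic sketch: guess whether a checkpoint is SDXL or SD1.x from its key
# prefixes. Prefixes are assumptions; merged or pruned files may deviate.
from safetensors.torch import load_file

sd = load_file("models/checkpoints/your_checkpoint.safetensors")  # hypothetical path
if any(k.startswith("conditioner.embedders.1.") for k in sd):
    print("Looks like SDXL (two text encoders).")
elif any(k.startswith("cond_stage_model.") for k in sd):
    print("Looks like SD1.x/SD2.x (single text encoder).")
else:
    print("Unrecognized layout.")
```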

I am using the t2iAdapter models hosted by TencentARC on HF. The particular model was the style model found here: https://huggingface.co/TencentARC/T2I-Adapter/blob/main/models/t2iadapter_style_sd14v1.pth

There are XL models on there, but this one is labelled SD14, not SD15, and I wonder if that's the problem: trying to use an SD1.5 checkpoint with an SD1.4 T2I-Adapter model.

traugdor avatar Jan 05 '24 00:01 traugdor

I'm getting this exact same error as well when trying to use the StyleModelApply node. All the checkpoints I use are SD 1.5, and I've made sure to use an SD 1.5-compatible T2I style adapter, according to a YouTube tutorial I've been following: https://huggingface.co/TencentARC/T2I-Adapter/blob/main/models/coadapter-style-sd15v1.pth

The only other thing I think I've used CLIP vision for recently is InsightFace, which worked perfectly fine with the SD 1.5 CLIP vision model I have.

crafter312 avatar Jan 07 '24 01:01 crafter312

Is there any solution to this problem? I get the same error when loading any t2i adapter or coadapter in ComfyUI.

cupret avatar Jan 18 '24 09:01 cupret

Same here. Tried 3 different t2iStyle models, same error.

Pos13 avatar Jan 27 '24 15:01 Pos13

The clip vision model must be this one: clip-vit-large-patch14 (https://huggingface.co/openai/clip-vit-large-patch14/blob/main/pytorch_model.bin); other models may cause the error.

I am using the t2iAdapter models hosted by TencentARC on HF. The particular model was the style model found here: https://huggingface.co/TencentARC/T2I-Adapter/blob/main/models/t2iadapter_style_sd14v1.pth [...]

sphantix avatar Feb 02 '24 02:02 sphantix
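
One hedged way to check which ViT variant a clip_vision file really is: load its weights and look at the vision embedding width, which is 1024 for ViT-L/14 and 1280 for ViT-H/14. The key name below follows the Hugging Face layout of the linked clip-vit-large-patch14 file and is an assumption for other checkpoints.

```python
# Sketch: report the vision embedding width of a CLIP Vision checkpoint.
# The key name assumes the Hugging Face layout of openai/clip-vit-large-patch14.
import torch

sd = torch.load("models/clip_vision/pytorch_model.bin", map_location="cpu")
if isinstance(sd, dict) and "state_dict" in sd:  # some files nest the weights
    sd = sd["state_dict"]
for key, value in sd.items():
    if key.endswith("class_embedding"):
        # ViT-L/14 prints a 1024-wide embedding; ViT-H/14 prints 1280.
        print(key, tuple(value.shape))
```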

Using that file gives me the error below instead (I also have the pesky expected-size-1280-but-got-size-1024 error):

Error occurred when executing CLIPVisionLoader:

invalid load key, '\xa8'.

File "C:\ComfyUI_P\ComfyUI\execution.py", line 152, in recursive_execute output_data, output_ui = get_output_data(obj, input_data_all) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "C:\ComfyUI_P\ComfyUI\execution.py", line 82, in get_output_data return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "C:\ComfyUI_P\ComfyUI\execution.py", line 75, in map_node_over_list results.append(getattr(obj, func)(**slice_dict(input_data_all, i))) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "C:\ComfyUI_P\ComfyUI\nodes.py", line 889, in load_clip clip_vision = comfy.clip_vision.load(clip_path) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "C:\ComfyUI_P\ComfyUI\comfy\clip_vision.py", line 112, in load sd = load_torch_file(ckpt_path) ^^^^^^^^^^^^^^^^^^^^^^^^^^ File "C:\ComfyUI_P\ComfyUI\comfy\utils.py", line 22, in load_torch_file pl_sd = torch.load(ckpt, map_location=device, pickle_module=comfy.checkpoint_pickle) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "C:\ComfyUI_P\python_embeded\Lib\site-packages\torch\serialization.py", line 1040, in load return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "C:\ComfyUI_P\python_embeded\Lib\site-packages\torch\serialization.py", line 1258, in _legacy_load magic_number = pickle_module.load(f, **pickle_load_args)

Kallamamran avatar Mar 19 '24 13:03 Kallamamran
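
An "invalid load key" from torch.load usually means the file on disk is not a valid checkpoint at all, for example a truncated download or an HTML page saved in place of the weights, rather than anything ComfyUI-specific. A quick sanity check, with an assumed path:

```python
# Sketch: sanity-check a checkpoint that torch.load rejects with "invalid load key".
# The path is an assumption -- point it at the file ComfyUI is trying to load.
import os

path = r"C:\ComfyUI_P\ComfyUI\models\clip_vision\pytorch_model.bin"
print("size (MB):", os.path.getsize(path) // 1_000_000)  # the linked ViT-L/14 .bin is roughly 1700 MB
with open(path, "rb") as f:
    head = f.read(4)
# Zip-based torch checkpoints start with b'PK\x03\x04'; legacy pickles with b'\x80'.
# Something like b'<!DO' (an HTML page) means the download itself went wrong.
print(head)
```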

I ran into the same issues. I guess this feature is a dead end at this point, since there has been no resolution in a year.

Like others, I used: "coadapter-style-sd15v1.pth", "sd15/v1-5-pruned-emaonly.safetensors", "CLIP-ViT-H-14-laion2B-s32B-b79K.safetensors"

geraldthewes avatar Apr 15 '24 12:04 geraldthewes

I think it's safe to say this is working. I have tested it and it seems to work okay.

Source image: 00006-703160079

Output image: ComfyUI_temp_srxpf_00001_

Prompt: A woman walking through a forest

My ComfyUI workflow is embedded in the output image. The model used can be found on Civitai.

The clip vision model must be this one: clip-vit-large-patch14 (https://huggingface.co/openai/clip-vit-large-patch14/blob/main/pytorch_model.bin); other models may cause the error.

I am using the t2iAdapter models hosted by TencentARC on HF. The particular model was the style model found here: https://huggingface.co/TencentARC/T2I-Adapter/blob/main/models/t2iadapter_style_sd14v1.pth [...]

traugdor avatar Apr 15 '24 14:04 traugdor

Closing as it seems the issue is resolved. Please re-open if anything comes up.

robinjhuang avatar Jul 03 '24 21:07 robinjhuang