ComfyUI
ClipVision, StyleModel - any example?
Any example of how it works?
Open this PNG file in ComfyUI, put the style T2I adapter in models/style_models and the CLIP vision model https://huggingface.co/openai/clip-vit-large-patch14/blob/main/pytorch_model.bin in models/clip_vision.
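If you prefer to script the two downloads, here is a minimal sketch. It assumes `huggingface_hub` is installed and that your install lives in a folder named `ComfyUI` (both assumptions, adjust as needed); the style adapter file is the TencentARC `t2iadapter_style_sd14v1.pth` mentioned later in this thread. Downloading the files manually from the linked pages and copying them into place works just as well.

```python
# Hedged sketch: fetch the CLIP vision model and the TencentARC style adapter
# into a ComfyUI install. The ComfyUI path below is an assumption.
import shutil
from pathlib import Path

from huggingface_hub import hf_hub_download

COMFY = Path("ComfyUI")  # assumed install location

downloads = [
    ("openai/clip-vit-large-patch14", "pytorch_model.bin", COMFY / "models" / "clip_vision"),
    ("TencentARC/T2I-Adapter", "models/t2iadapter_style_sd14v1.pth", COMFY / "models" / "style_models"),
]

for repo_id, filename, target_dir in downloads:
    target_dir.mkdir(parents=True, exist_ok=True)
    cached = hf_hub_download(repo_id=repo_id, filename=filename)  # downloads into the HF cache
    shutil.copy(cached, target_dir / Path(filename).name)         # copy into the ComfyUI model folder
```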

Is there a way to control the weight of the style model in the generation?
Open this PNG file in ComfyUI, put the style T2I adapter in models/style_models and the CLIP vision model https://huggingface.co/openai/clip-vit-large-patch14/blob/main/pytorch_model.bin in models/clip_vision.
I tried this example, but Comfy only throws an exception.
I used the standalone distribution downloaded a while ago; it didn't work. Did a git pull for the latest code; that also didn't work. Clean base distribution, no custom nodes.
Error occurred when executing StyleModelApply:
Sizes of tensors must match except in dimension 1. Expected size 1280 but got size 1024 for tensor number 1 in the list.
File "E:\new_ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\ComfyUI\execution.py", line 153, in recursive_execute
output_data, output_ui = get_output_data(obj, input_data_all)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\new_ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\ComfyUI\execution.py", line 83, in get_output_data
return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\new_ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\ComfyUI\execution.py", line 76, in map_node_over_list
results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\new_ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\ComfyUI\nodes.py", line 867, in apply_stylemodel
cond = style_model.get_cond(clip_vision_output).flatten(start_dim=0, end_dim=1).unsqueeze(dim=0)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\new_ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\ComfyUI\comfy\sd.py", line 281, in get_cond
return self.model(input.last_hidden_state)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\new_ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\new_ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\new_ComfyUI_windows_portable_nvidia_cu121_or_cpu\ComfyUI_windows_portable\ComfyUI\comfy\t2i_adapter\adapter.py", line 214, in forward
x = torch.cat([x, style_embedding], dim=1)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
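For reference, the 1280-vs-1024 in this message is the feature width of the CLIP vision embedding: the SD1.x style adapter concatenates its learned style tokens (1024 wide, which matches clip-vit-large-patch14) onto the CLIP vision hidden states inside adapter.py, so a 1280-wide embedding from a ViT-H CLIP vision model cannot be concatenated. A minimal sketch under those assumptions (token counts are illustrative):

```python
import torch

style_tokens = torch.randn(1, 8, 1024)    # adapter's learned style_embedding (1024-wide)
vit_h_tokens = torch.randn(1, 257, 1280)  # last_hidden_state from a ViT-H CLIP vision model

try:
    torch.cat([vit_h_tokens, style_tokens], dim=1)
except RuntimeError as err:
    print(err)  # Sizes of tensors must match except in dimension 1 ...

vit_l_tokens = torch.randn(1, 257, 1024)  # last_hidden_state from clip-vit-large-patch14
print(torch.cat([vit_l_tokens, style_tokens], dim=1).shape)  # torch.Size([1, 265, 1024])
```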
I am also getting this error. Did you find a solution that works?
Check whether the checkpoint is compatible with the StyleModel/ClipVision model. It is not possible to mix an SD1.5 StyleModel/ClipVision model with an SDXL checkpoint.
I am using the t2iAdapter models hosted by TencentARC on HF. The particular model was the style model found here: https://huggingface.co/TencentARC/T2I-Adapter/blob/main/models/t2iadapter_style_sd14v1.pth
There are XL models on there but this one is labelled SD14 not SD15 and I wonder if that's the problem, trying to use a SD1.5 checkpoint with a SD1.4 T2IAdapter model.
I'm getting this exact same error as well when trying to use the StyleModelApply node. All the checkpoints I use are SD 1.5, and I've made sure to use an SD 1.5 compatible T2I style adapter, according to a YouTube tutorial I've been following: https://huggingface.co/TencentARC/T2I-Adapter/blob/main/models/coadapter-style-sd15v1.pth
The only other thing I've used CLIP vision for recently, I think, is InsightFace, and that worked perfectly fine with the SD 1.5 CLIP vision model I have.
Is there any solution to this problem? I got the same problem when loading any t2i adapter or coadapter in comfyui.
Same here. Tried 3 different t2iStyle models, same error.
The CLIP vision model must be [clip-vit-large-patch14](https://huggingface.co/openai/clip-vit-large-patch14/blob/main/pytorch_model.bin); other CLIP vision models may cause this error.
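If you're unsure which CLIP vision checkpoint a file actually contains, a rough check like the one below can help. The key name follows the usual Hugging Face CLIP layout and is an assumption, and the path is illustrative:

```python
import torch

def clip_vision_width(path: str) -> int:
    """Report the hidden width of a CLIP vision checkpoint: 1024 -> ViT-L/14, 1280 -> ViT-H/14."""
    if path.endswith(".safetensors"):
        from safetensors.torch import load_file
        sd = load_file(path)
    else:
        sd = torch.load(path, map_location="cpu")
    for name, tensor in sd.items():
        if name.endswith("embeddings.class_embedding"):
            return tensor.shape[-1]
    raise KeyError("no class_embedding key found; is this a CLIP vision checkpoint?")

# Expect 1024 for clip-vit-large-patch14; 1280 points at a ViT-H model and the error above.
print(clip_vision_width("ComfyUI/models/clip_vision/pytorch_model.bin"))
```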
I am using the t2iAdapter models hosted by TencentARC on HF. The particular model was the style model found here: https://huggingface.co/TencentARC/T2I-Adapter/blob/main/models/t2iadapter_style_sd14v1.pth
There are XL models on there but this one is labelled SD14 not SD15 and I wonder if that's the problem, trying to use a SD1.5 checkpoint with a SD1.4 T2IAdapter model.
Using that file gives me the error below instead (I also have the pesky expected-size-1280-but-got-size-1024 error):
Error occurred when executing CLIPVisionLoader:
invalid load key, '\xa8'.
File "C:\ComfyUI_P\ComfyUI\execution.py", line 152, in recursive_execute output_data, output_ui = get_output_data(obj, input_data_all) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "C:\ComfyUI_P\ComfyUI\execution.py", line 82, in get_output_data return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "C:\ComfyUI_P\ComfyUI\execution.py", line 75, in map_node_over_list results.append(getattr(obj, func)(**slice_dict(input_data_all, i))) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "C:\ComfyUI_P\ComfyUI\nodes.py", line 889, in load_clip clip_vision = comfy.clip_vision.load(clip_path) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "C:\ComfyUI_P\ComfyUI\comfy\clip_vision.py", line 112, in load sd = load_torch_file(ckpt_path) ^^^^^^^^^^^^^^^^^^^^^^^^^^ File "C:\ComfyUI_P\ComfyUI\comfy\utils.py", line 22, in load_torch_file pl_sd = torch.load(ckpt, map_location=device, pickle_module=comfy.checkpoint_pickle) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "C:\ComfyUI_P\python_embeded\Lib\site-packages\torch\serialization.py", line 1040, in load return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "C:\ComfyUI_P\python_embeded\Lib\site-packages\torch\serialization.py", line 1258, in _legacy_load magic_number = pickle_module.load(f, **pickle_load_args)
I ran into the same issues. I guess this feature is a dead end at this point, since there has been no resolution in a year.
Like others, I used: "coadapter-style-sd15v1.pth", "sd15/v1-5-pruned-emaonly.safetensors", "CLIP-ViT-H-14-laion2B-s32B-b79K.safetensors"
I think it's safe to say this is working: I have tested it and it works okay.
Source image:
Output image:
Prompt: A woman walking through a forest
My ComfyUI workflow is embedded in the output image. The model used can be found on Civitai.
Closing as it seems the issue is resolved. Please re-open if anything comes up.