Not working on apple silicon (CogVideoX Fun Sampler Implementation)
!!! Exception during processing !!! unsupported scalarType
Traceback (most recent call last):
File "/Users/user/AI/ComfyUI/execution.py", line 323, in execute
output_data, output_ui, has_subgraph = get_output_data(obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
File "/Users/user/AI/ComfyUI/execution.py", line 198, in get_output_data
return_values = _map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
File "/Users/user/AI/ComfyUI/execution.py", line 169, in _map_node_over_list
process_inputs(input_dict, i)
File "/Users/user/AI/ComfyUI/execution.py", line 158, in process_inputs
results.append(getattr(obj, func)(**inputs))
File "/Users/user/AI/ComfyUI/custom_nodes/ComfyUI-CogVideoXWrapper/nodes.py", line 519, in process
autocast_context = torch.autocast(mm.get_autocast_device(device)) if autocastcondition else nullcontext()
File "/Users/user/AI/ComfyUI/venv/lib/python3.10/site-packages/torch/amp/autocast_mode.py", line 229, in __init__
dtype = torch.get_autocast_dtype(device_type)
RuntimeError: unsupported scalarType
I probably left the fp8 fast mode on, check that and put it to disabled to see if it resolves this. What GPU are you using?
No, it's disabled. I'm on a MacBook. The issue seems to be that autocast on MPS isn't supported in any PyTorch build except nightly (as of a week ago), so the autocast to fp16 is breaking things. Oddly, when I went to nightly I started getting a different error: in prompt_embeds=positive.to(dtype).to(device), positive is a list, and a list has no .to method.
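One way to keep the sampler from crashing on PyTorch builds that lack MPS autocast is to fall back to a no-op context when torch.autocast rejects the device. This is only a sketch: safe_autocast is a hypothetical helper name and is not part of the wrapper.

```python
from contextlib import nullcontext


def safe_autocast(device_type, enabled=True):
    """Return torch.autocast(device_type) when usable, else a no-op context.

    Hypothetical helper for illustration; not taken from the wrapper's code.
    """
    if not enabled:
        return nullcontext()
    try:
        import torch  # imported lazily so the helper degrades without torch
        return torch.autocast(device_type)
    except (ImportError, RuntimeError):
        # Builds without autocast support for this device raise RuntimeError
        # (for example "unsupported scalarType" in the trace above).
        return nullcontext()
```

It would be used as `with safe_autocast("mps"): ...`. Without autocast, tensor and weight dtypes have to be kept consistent by hand, so this avoids the crash but not necessarily later dtype mismatches.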
Are you using the example workflow?
Haha, I had overlooked that CogVideo uses different text nodes than the stock ones. I swapped to those and that part now passes. However, it now seems to break because something appears to be hardcoded to use CUDA instead of falling back to MPS or CPU when CUDA isn't available. I haven't tracked down where yet...
I updated pipeline_cogvideox.py, where you had a hardcoded torch.device("cuda"), to device = torch.device("cuda" if torch.cuda.is_available() else "mps" if torch.backends.mps.is_available() else "cpu"), but that doesn't seem to be the call that has me hung up. What's odd is that I can't find any other hardcoded references to cuda that would break things.
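The fallback order described in that change (CUDA, then MPS, then CPU) can be pulled into a small helper so it isn't repeated as a nested conditional. A minimal sketch; pick_device_name and pick_torch_device are hypothetical names, not functions from the wrapper.

```python
def pick_device_name(cuda_available, mps_available):
    """Choose a device string in the order CUDA, then MPS, then CPU."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def pick_torch_device():
    """Query PyTorch for what is available and return a torch.device."""
    import torch  # imported here so pick_device_name stays torch-free

    # Very old PyTorch builds have no torch.backends.mps at all.
    mps = getattr(torch.backends, "mps", None)
    mps_available = mps is not None and mps.is_available()
    return torch.device(pick_device_name(torch.cuda.is_available(), mps_available))
```

Inside a ComfyUI node it may be simpler to reuse the device ComfyUI already selected (the startup log prints it as "Device: mps") rather than probing again.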
Traceback (most recent call last):
File "/Users/user/AI/ComfyUI/execution.py", line 323, in execute
output_data, output_ui, has_subgraph = get_output_data(obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
File "/Users/user/AI/ComfyUI/execution.py", line 198, in get_output_data
return_values = _map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
File "/Users/user/AI/ComfyUI/execution.py", line 169, in _map_node_over_list
process_inputs(input_dict, i)
File "/Users/user/AI/ComfyUI/execution.py", line 158, in process_inputs
results.append(getattr(obj, func)(**inputs))
File "/Users/user/AI/ComfyUI/custom_nodes/ComfyUI-CogVideoXWrapper/nodes.py", line 535, in process
latents = pipe(
File "/Users/user/AI/ComfyUI/venv/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
return func(*args, **kwargs)
File "/Users/user/AI/ComfyUI/custom_nodes/ComfyUI-CogVideoXWrapper/cogvideox_fun/pipeline_cogvideox_inpaint.py", line 634, in __call__
self.vae.to(device)
File "/Users/user/AI/ComfyUI/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1340, in to
return self._apply(convert)
File "/Users/user/AI/ComfyUI/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 900, in _apply
module._apply(fn)
File "/Users/user/AI/ComfyUI/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 900, in _apply
module._apply(fn)
File "/Users/user/AI/ComfyUI/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 900, in _apply
module._apply(fn)
File "/Users/user/AI/ComfyUI/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 927, in _apply
param_applied = fn(param)
File "/Users/user/AI/ComfyUI/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1326, in convert
return t.to(
File "/Users/user/AI/ComfyUI/venv/lib/python3.10/site-packages/torch/cuda/__init__.py", line 310, in _lazy_init
raise AssertionError("Torch not compiled with CUDA enabled")
AssertionError: Torch not compiled with CUDA enabled
Strange thing: in that inpainting file, if I add a print to see what device is before it tries to send the VAE to it, device is set by device = self._execution_device, and printing it gives "cuda:0".
Yeah, I'm not sure where _execution_device is getting set. Even if I hardcode that instance of it to "mps" or "cpu", it seems to be used elsewhere, and it's still trying to force things onto CUDA, which Macs don't have.
I think it defaults to cuda if it can't find it from accelerate... dunno why that wouldn't work, you can try just forcing the execution device to mps though.
If you mean just setting self._execution_device = "mps", that won't work; apparently it's not allowed:
AttributeError: can't set attribute '_execution_device'...
After a bit of digging, it seems that diffusers returns the device that's set in the model's _hf_hook, which is returning cuda:0.
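To see which device each component's hook has recorded, something like the following could be run against the pipeline. It is a diagnostic sketch under assumptions: hook_devices is a hypothetical helper, and it only inspects the top-level _hf_hook of each component, while diffusers may also look at hooks on submodules.

```python
def hook_devices(components):
    """Map each component name to the execution device stored on its hook.

    `components` is expected to be a dict such as `pipe.components`.
    Components without an `_hf_hook` attribute are reported as None.
    """
    report = {}
    for name, model in components.items():
        hook = getattr(model, "_hf_hook", None)
        report[name] = getattr(hook, "execution_device", None)
    return report
```

For example, `print(hook_devices(pipe.components))` just before the failing call should show whether any component still reports cuda:0.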
Potentially found the reason: I wasn't calling the enable_model_cpu_offload with a device, so that would make it default to cuda.
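In other words, the offload call needs the target device passed explicitly, because the `device` argument of diffusers' enable_model_cpu_offload defaults to "cuda". A minimal sketch of the idea; enable_offload_on is a hypothetical wrapper name and the exact call site in the node code may differ.

```python
def enable_offload_on(pipe, device_name):
    """Install model offload hooks that execute on `device_name`.

    Without the explicit argument, the hooks record the default device
    ("cuda"), which later surfaces through `_execution_device`.
    """
    pipe.enable_model_cpu_offload(device=device_name)
    return pipe
```

On Apple silicon this would be called as `enable_offload_on(pipe, "mps")`.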
Yep, that solved that issue. So now, with PyTorch 2.6.0-dev (needed for autocast to work in the pipeline), it doesn't give the device error anymore. Great catch, optional properties are so easy to overlook in these codebases.
So close to it running lol i can feel it! XD
Now the hang is at ...
File "/Users/cc/AI/ComfyUI/custom_nodes/ComfyUI-CogVideoXWrapper/cogvideox_fun/autoencoder_magvit.py", line 64, in forward
return super().forward(input)
File "/Users/cc/AI/ComfyUI/venv/lib/python3.10/site-packages/torch/nn/modules/conv.py", line 725, in forward
return self._conv_forward(input, self.weight, self.bias)
File "/Users/cc/AI/ComfyUI/venv/lib/python3.10/site-packages/torch/nn/modules/conv.py", line 720, in _conv_forward
return F.conv3d(
RuntimeError: Input type (float) and bias type (c10::Half) should be the same
I get the feeling the dtype isn't being passed somewhere it needs to be for float16.
Yeah, I'm not sure why, but it seems the conv3d input and the layer's weights/bias end up in different dtypes (float32 vs. float16)...
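A quick way to check whether the VAE itself holds mixed dtypes, as opposed to the input arriving in the wrong dtype, is to count parameter dtypes and list the ones that differ from what is expected. This is a diagnostic sketch only; dtype_report and mismatched_params are hypothetical helper names.

```python
from collections import Counter


def dtype_report(module):
    """Count parameters per dtype, e.g. {'torch.float16': 240}."""
    return Counter(str(p.dtype) for _, p in module.named_parameters())


def mismatched_params(module, expected_dtype):
    """Return names of parameters whose dtype differs from `expected_dtype`."""
    return [
        name
        for name, p in module.named_parameters()
        if p.dtype != expected_dtype
    ]
```

If `dtype_report(pipe.vae)` shows a single dtype, the weights are consistent and the mismatch comes from the tensor being fed into vae.encode, which would then need a cast to the VAE's dtype before the call.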
this is my setup btw
And here's the full trace:
!!! Exception during processing !!! Input type (float) and bias type (c10::Half) should be the same
Traceback (most recent call last):
File "/Users/user/AI/ComfyUI/execution.py", line 323, in execute
output_data, output_ui, has_subgraph = get_output_data(obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
File "/Users/user/AI/ComfyUI/execution.py", line 198, in get_output_data
return_values = _map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
File "/Users/user/AI/ComfyUI/execution.py", line 169, in _map_node_over_list
process_inputs(input_dict, i)
File "/Users/user/AI/ComfyUI/execution.py", line 158, in process_inputs
results.append(getattr(obj, func)(**inputs))
File "/Users/user/AI/ComfyUI/custom_nodes/ComfyUI-CogVideoXWrapper/nodes.py", line 530, in process
latents = pipe(
File "/Users/user/AI/ComfyUI/venv/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
return func(*args, **kwargs)
File "/Users/user/AI/ComfyUI/custom_nodes/ComfyUI-CogVideoXWrapper/cogvideox_fun/pipeline_cogvideox_inpaint.py", line 719, in __call__
_, masked_video_latents = self.prepare_mask_latents(
File "/Users/user/AI/ComfyUI/custom_nodes/ComfyUI-CogVideoXWrapper/cogvideox_fun/pipeline_cogvideox_inpaint.py", line 340, in prepare_mask_latents
mask_pixel_values_bs = self.vae.encode(mask_pixel_values_bs)[0]
File "/Users/user/AI/ComfyUI/venv/lib/python3.10/site-packages/diffusers/utils/accelerate_utils.py", line 46, in wrapper
return method(self, *args, **kwargs)
File "/Users/user/AI/ComfyUI/custom_nodes/ComfyUI-CogVideoXWrapper/cogvideox_fun/autoencoder_magvit.py", line 1120, in encode
z_intermediate = self.encoder(z_intermediate)
File "/Users/user/AI/ComfyUI/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/Users/user/AI/ComfyUI/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
return forward_call(*args, **kwargs)
File "/Users/user/AI/ComfyUI/custom_nodes/ComfyUI-CogVideoXWrapper/cogvideox_fun/autoencoder_magvit.py", line 739, in forward
hidden_states = down_block(hidden_states, temb, None)
File "/Users/user/AI/ComfyUI/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/Users/user/AI/ComfyUI/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
return forward_call(*args, **kwargs)
File "/Users/user/AI/ComfyUI/custom_nodes/ComfyUI-CogVideoXWrapper/cogvideox_fun/autoencoder_magvit.py", line 415, in forward
hidden_states = resnet(hidden_states, temb, zq)
File "/Users/user/AI/ComfyUI/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/Users/user/AI/ComfyUI/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
return forward_call(*args, **kwargs)
File "/Users/user/AI/ComfyUI/custom_nodes/ComfyUI-CogVideoXWrapper/cogvideox_fun/autoencoder_magvit.py", line 297, in forward
hidden_states = self.conv1(hidden_states)
File "/Users/user/AI/ComfyUI/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/Users/user/AI/ComfyUI/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
return forward_call(*args, **kwargs)
File "/Users/user/AI/ComfyUI/custom_nodes/ComfyUI-CogVideoXWrapper/cogvideox_fun/autoencoder_magvit.py", line 149, in forward
output = self.conv(inputs)
File "/Users/user/AI/ComfyUI/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/Users/user/AI/ComfyUI/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
return forward_call(*args, **kwargs)
File "/Users/user/AI/ComfyUI/custom_nodes/ComfyUI-CogVideoXWrapper/cogvideox_fun/autoencoder_magvit.py", line 69, in forward
return super().forward(input)
File "/Users/user/AI/ComfyUI/venv/lib/python3.10/site-packages/torch/nn/modules/conv.py", line 725, in forward
return self._conv_forward(input, self.weight, self.bias)
File "/Users/user/AI/ComfyUI/venv/lib/python3.10/site-packages/torch/nn/modules/conv.py", line 720, in _conv_forward
return F.conv3d(
RuntimeError: Input type (float) and bias type (c10::Half) should be the same
Also, I saw you mention that diffusers 0.30.3 is required. I was on 0.30.2, but upgrading didn't change anything.
Diffusers 0.30.3 is required for the official I2V model only, not the "Fun" variant.
Does that work for you btw, or is this only an issue with the "Fun" models?
I cleared my folder, pulled the latest from the git repo, and tested the 2b models with their respective samplers. I got very similar errors, but slightly different ones.
with standard 2b (first in list)...
!!! Exception during processing !!! Expected query, key, and value to have the same dtype, but got query.dtype: float key.dtype: float and value.dtype: c10::Half instead.
Traceback (most recent call last):
File "/Users/user/AI/ComfyUI/execution.py", line 323, in execute
output_data, output_ui, has_subgraph = get_output_data(obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
File "/Users/user/AI/ComfyUI/execution.py", line 198, in get_output_data
return_values = _map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
File "/Users/user/AI/ComfyUI/execution.py", line 169, in _map_node_over_list
process_inputs(input_dict, i)
File "/Users/user/AI/ComfyUI/execution.py", line 158, in process_inputs
results.append(getattr(obj, func)(**inputs))
File "/Users/user/AI/ComfyUI/custom_nodes/ComfyUI-CogVideoXWrapper/nodes.py", line 455, in process
latents = pipeline["pipe"](
File "/Users/user/AI/ComfyUI/venv/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
return func(*args, **kwargs)
File "/Users/user/AI/ComfyUI/custom_nodes/ComfyUI-CogVideoXWrapper/pipeline_cogvideox.py", line 607, in __call__
noise_pred = self.transformer(
File "/Users/user/AI/ComfyUI/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/Users/user/AI/ComfyUI/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
return forward_call(*args, **kwargs)
File "/Users/user/AI/ComfyUI/venv/lib/python3.10/site-packages/diffusers/models/transformers/cogvideox_transformer_3d.py", line 456, in forward
hidden_states, encoder_hidden_states = block(
File "/Users/user/AI/ComfyUI/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/Users/user/AI/ComfyUI/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
return forward_call(*args, **kwargs)
File "/Users/user/AI/ComfyUI/venv/lib/python3.10/site-packages/diffusers/models/transformers/cogvideox_transformer_3d.py", line 131, in forward
attn_hidden_states, attn_encoder_hidden_states = self.attn1(
File "/Users/user/AI/ComfyUI/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/Users/user/AI/ComfyUI/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
return forward_call(*args, **kwargs)
File "/Users/user/AI/ComfyUI/venv/lib/python3.10/site-packages/diffusers/models/attention_processor.py", line 490, in forward
return self.processor(
File "/Users/user/AI/ComfyUI/venv/lib/python3.10/site-packages/diffusers/models/attention_processor.py", line 1925, in __call__
hidden_states = F.scaled_dot_product_attention(
RuntimeError: Expected query, key, and value to have the same dtype, but got query.dtype: float key.dtype: float and value.dtype: c10::Half instead.
with fun 2b...
!!! Exception during processing !!! Input type (float) and bias type (c10::Half) should be the same
Traceback (most recent call last):
File "/Users/user/AI/ComfyUI/execution.py", line 323, in execute
output_data, output_ui, has_subgraph = get_output_data(obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
File "/Users/user/AI/ComfyUI/execution.py", line 198, in get_output_data
return_values = _map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
File "/Users/user/AI/ComfyUI/execution.py", line 169, in _map_node_over_list
process_inputs(input_dict, i)
File "/Users/user/AI/ComfyUI/execution.py", line 158, in process_inputs
results.append(getattr(obj, func)(**inputs))
File "/Users/user/AI/ComfyUI/custom_nodes/ComfyUI-CogVideoXWrapper/nodes.py", line 641, in process
latents = pipe(
File "/Users/user/AI/ComfyUI/venv/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
return func(*args, **kwargs)
File "/Users/user/AI/ComfyUI/custom_nodes/ComfyUI-CogVideoXWrapper/cogvideox_fun/pipeline_cogvideox_inpaint.py", line 718, in __call__
_, masked_video_latents = self.prepare_mask_latents(
File "/Users/user/AI/ComfyUI/custom_nodes/ComfyUI-CogVideoXWrapper/cogvideox_fun/pipeline_cogvideox_inpaint.py", line 339, in prepare_mask_latents
mask_pixel_values_bs = self.vae.encode(mask_pixel_values_bs)[0]
File "/Users/user/AI/ComfyUI/venv/lib/python3.10/site-packages/diffusers/utils/accelerate_utils.py", line 46, in wrapper
return method(self, *args, **kwargs)
File "/Users/user/AI/ComfyUI/custom_nodes/ComfyUI-CogVideoXWrapper/cogvideox_fun/autoencoder_magvit.py", line 1114, in encode
z_intermediate = self.encoder(z_intermediate)
File "/Users/user/AI/ComfyUI/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/Users/user/AI/ComfyUI/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
return forward_call(*args, **kwargs)
File "/Users/user/AI/ComfyUI/custom_nodes/ComfyUI-CogVideoXWrapper/cogvideox_fun/autoencoder_magvit.py", line 733, in forward
hidden_states = down_block(hidden_states, temb, None)
File "/Users/user/AI/ComfyUI/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/Users/user/AI/ComfyUI/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
return forward_call(*args, **kwargs)
File "/Users/user/AI/ComfyUI/custom_nodes/ComfyUI-CogVideoXWrapper/cogvideox_fun/autoencoder_magvit.py", line 409, in forward
hidden_states = resnet(hidden_states, temb, zq)
File "/Users/user/AI/ComfyUI/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/Users/user/AI/ComfyUI/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
return forward_call(*args, **kwargs)
File "/Users/user/AI/ComfyUI/custom_nodes/ComfyUI-CogVideoXWrapper/cogvideox_fun/autoencoder_magvit.py", line 291, in forward
hidden_states = self.conv1(hidden_states)
File "/Users/user/AI/ComfyUI/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/Users/user/AI/ComfyUI/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
return forward_call(*args, **kwargs)
File "/Users/user/AI/ComfyUI/custom_nodes/ComfyUI-CogVideoXWrapper/cogvideox_fun/autoencoder_magvit.py", line 144, in forward
output = self.conv(inputs)
File "/Users/user/AI/ComfyUI/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/Users/user/AI/ComfyUI/venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
return forward_call(*args, **kwargs)
File "/Users/user/AI/ComfyUI/custom_nodes/ComfyUI-CogVideoXWrapper/cogvideox_fun/autoencoder_magvit.py", line 64, in forward
return super().forward(input)
File "/Users/user/AI/ComfyUI/venv/lib/python3.10/site-packages/torch/nn/modules/conv.py", line 725, in forward
return self._conv_forward(input, self.weight, self.bias)
File "/Users/user/AI/ComfyUI/venv/lib/python3.10/site-packages/torch/nn/modules/conv.py", line 720, in _conv_forward
return F.conv3d(
RuntimeError: Input type (float) and bias type (c10::Half) should be the same
I tried to resolve the above while running the 5B I2V model. It seems to be a deeper issue within the CogVideo diffusers model or in PyTorch's MPS implementation (though I can't be sure). I am leaving these details here in case someone picks this up:
- The precision sent from the codebase in this repository seems to be working correctly (I was running at float32 precision and all the tensors sent to the underlying model had the same precision)
- After the following line of code in the diffusers library: https://github.com/huggingface/diffusers/blob/main/src/diffusers/models/attention_processor.py#L1924 the query and key tensors have a dtype of float32 whereas the value tensor has a dtype of float16 (which, judging from the errors above, seems to be the issue)
At the second point, I tried forcing all tensors to float32, and also forcing them all to float16, before the call to scaled_dot_product_attention. In both cases my MacBook gave an OOM error (I have the 36 GB RAM model).
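The forced cast described here can be sketched as below. It is an illustration of the attempt, not a confirmed fix: unify_qkv is a hypothetical helper, and as reported above, both variants of the cast ran out of memory on a 36 GB machine.

```python
def unify_qkv(query, key, value, dtype=None):
    """Cast query, key and value to one dtype before attention.

    Defaults to the dtype of `value`; pass `dtype` to force another one.
    Works on any objects with a `.dtype` attribute and a `.to(dtype)` method,
    such as torch tensors.
    """
    target = dtype if dtype is not None else value.dtype
    return query.to(target), key.to(target), value.to(target)
```

The result would then be passed on to F.scaled_dot_product_attention, which requires all three tensors to share a dtype.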
Might try to set this up on a GPU instance somewhere using an Nvidia card ¯_(ツ)_/¯
Well, float32 precision will likely OOM even without any bugs or issues. For bf16 they show 16 GB (confirmed, as it can OOM even on a T4 Colab); they even mention in the Colabs that it can OOM on 16 GB of VRAM and memory. I imagine some of the memory use in this Comfy extension is tensors being shuffled around, but I definitely think it needs to run in fp16 to have a chance of running locally on a 36 GB machine…
Keep in mind that on Macs offloading doesn't do anything, since VRAM and RAM are unified. We'd have to switch to completely unloading extraneous stuff, not just shifting it to the CPU.
I have a similar issue running the 5B I2V model on a MacBook Pro M3 Max (128 GB RAM, latest Sonoma), Python 3.12.4 (miniconda3), PyTorch 2.6.0.dev20240924. This happens regardless of using the flags --force-fp16, --force-fp32, or --dont-upcast-attention.
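As a rough back-of-the-envelope check on the precision argument, the weights alone can be sized from the parameter count. The sketch assumes a nominal 5 billion parameters for the "5B" transformer and ignores the text encoder, the VAE, activations, and any temporary copies, so real usage is higher.

```python
def weights_gib(num_params, bytes_per_param):
    """Memory needed for the weights alone, in GiB."""
    return num_params * bytes_per_param / 2**30


NOMINAL_PARAMS = 5e9  # assumption: "5B" taken at face value

for label, nbytes in (("float32", 4), ("float16/bfloat16", 2)):
    print(f"{label}: {weights_gib(NOMINAL_PARAMS, nbytes):.1f} GiB")
```

Halving the bytes per parameter halves the weight footprint, which is why half precision matters so much on a machine where the model, the OS, and everything else share one memory pool.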
RuntimeError: Expected query, key, and value to have the same dtype, but got query.dtype: float key.dtype: float and value.dtype: c10::Half instead.
Let me know if you'd prefer I open another issue or run a few tests given this machine's memory. I've also tried brute forcing types as @digvijay7 mentioned above, to no avail. Any insights are welcome, thanks!
The full error output is:
** Python version: 3.12.4 | packaged by Anaconda, Inc. | (main, Jun 18 2024, 10:07:17) [Clang 14.0.6 ]
** Python executable: /Users/u/miniconda3/bin/python
** ComfyUI Path: /Users/u/ComfyUI
** Log path: /Users/u/ComfyUI/comfyui.log
Prestartup times for custom nodes:
0.0 seconds: /Users/u/ComfyUI/custom_nodes/rgthree-comfy
0.3 seconds: /Users/u/ComfyUI/custom_nodes/ComfyUI-Manager
Total VRAM 131072 MB, total RAM 131072 MB
pytorch version: 2.6.0.dev20240924
Forcing FP16.
Set vram state to: SHARED
Device: mps
Using sub quadratic optimization for cross attention, if you have memory or speed issues try using: --use-split-cross-attention
[Prompt Server] web root: /Users/u/ComfyUI/web
/Users/u/miniconda3/lib/python3.12/site-packages/kornia/feature/lightglue.py:44: FutureWarning: `torch.cuda.amp.custom_fwd(args...)` is deprecated. Please use `torch.amp.custom_fwd(args..., device_type='cuda')` instead.
@torch.cuda.amp.custom_fwd(cast_inputs=torch.float32)
### Loading: ComfyUI-Manager (V2.51.1)
### ComfyUI Revision: 2727 [fdf37566] | Released on '2024-09-24'
[rgthree] Loaded 42 exciting nodes.
[rgthree] NOTE: Will NOT use rgthree's optimized recursive execution as ComfyUI has changed.
Total VRAM 131072 MB, total RAM 131072 MB
pytorch version: 2.6.0.dev20240924
Forcing FP16.
Set vram state to: SHARED
Device: mps
[ComfyUI-Manager] default cache updated: https://raw.githubusercontent.com/ltdrdata/ComfyUI-Manager/main/alter-list.json
[ComfyUI-Manager] default cache updated: https://raw.githubusercontent.com/ltdrdata/ComfyUI-Manager/main/model-list.json
[ComfyUI-Manager] default cache updated: https://raw.githubusercontent.com/ltdrdata/ComfyUI-Manager/main/github-stats.json
[ComfyUI-Manager] default cache updated: https://raw.githubusercontent.com/ltdrdata/ComfyUI-Manager/main/extension-node-map.json
[ComfyUI-Manager] default cache updated: https://raw.githubusercontent.com/ltdrdata/ComfyUI-Manager/main/custom-node-list.json
Import times for custom nodes:
0.0 seconds: /Users/u/ComfyUI/custom_nodes/websocket_image_save.py
0.0 seconds: /Users/u/ComfyUI/custom_nodes/rgthree-comfy
0.0 seconds: /Users/u/ComfyUI/custom_nodes/ComfyUI-KJNodes
0.0 seconds: /Users/u/ComfyUI/custom_nodes/ComfyUI-Manager
0.1 seconds: /Users/u/ComfyUI/custom_nodes/ComfyUI-CogVideoXWrapper
0.2 seconds: /Users/u/ComfyUI/custom_nodes/ComfyUI-VideoHelperSuite
Starting server
To see the GUI go to: http://127.0.0.1:8188
got prompt
Encoded latents shape: torch.Size([1, 1, 16, 60, 90])
/Users/u/miniconda3/lib/python3.12/site-packages/transformers/tokenization_utils_base.py:1617: FutureWarning: `clean_up_tokenization_spaces` was not set. It will be set to `True` by default. This behavior will be deprecated in transformers v4.45, and will be then set to `False` by default. For more details check this issue: https://github.com/huggingface/transformers/issues/31884
warnings.warn(
Requested to load SD3ClipModel_
Loading 1 new model
loaded completely 0.0 4541.693359375 True
Temporal tiling disabled
0%| | 0/50 [00:00<?, ?it/s]huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
To disable this warning, you can either:
- Avoid using `tokenizers` before the fork if possible
- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
0%| | 0/50 [00:00<?, ?it/s]
!!! Exception during processing !!! Expected query, key, and value to have the same dtype, but got query.dtype: float key.dtype: float and value.dtype: c10::Half instead.
Traceback (most recent call last):
File "/Users/u/ComfyUI/execution.py", line 323, in execute
output_data, output_ui, has_subgraph = get_output_data(obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/u/ComfyUI/execution.py", line 198, in get_output_data
return_values = _map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/u/ComfyUI/execution.py", line 169, in _map_node_over_list
process_inputs(input_dict, i)
File "/Users/u/ComfyUI/execution.py", line 158, in process_inputs
results.append(getattr(obj, func)(**inputs))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/u/ComfyUI/custom_nodes/ComfyUI-CogVideoXWrapper/nodes.py", line 843, in process
latents = pipeline["pipe"](
^^^^^^^^^^^^^^^^^
File "/Users/u/miniconda3/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/Users/u/ComfyUI/custom_nodes/ComfyUI-CogVideoXWrapper/pipeline_cogvideox.py", line 615, in __call__
noise_pred = self.transformer(
^^^^^^^^^^^^^^^^^
File "/Users/u/miniconda3/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/u/miniconda3/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/u/miniconda3/lib/python3.12/site-packages/diffusers/models/transformers/cogvideox_transformer_3d.py", line 456, in forward
hidden_states, encoder_hidden_states = block(
^^^^^^
File "/Users/u/miniconda3/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/u/miniconda3/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/u/miniconda3/lib/python3.12/site-packages/diffusers/models/transformers/cogvideox_transformer_3d.py", line 131, in forward
attn_hidden_states, attn_encoder_hidden_states = self.attn1(
^^^^^^^^^^^
File "/Users/u/miniconda3/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/u/miniconda3/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/u/miniconda3/lib/python3.12/site-packages/diffusers/models/attention_processor.py", line 490, in forward
return self.processor(
^^^^^^^^^^^^^^^
File "/Users/u/miniconda3/lib/python3.12/site-packages/diffusers/models/attention_processor.py", line 1925, in __call__
hidden_states = F.scaled_dot_product_attention(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: Expected query, key, and value to have the same dtype, but got query.dtype: float key.dtype: float and value.dtype: c10::Half instead.
Can confirm this is an issue on M2 Max chips. https://github.com/pytorch/pytorch/issues/110285 just gonna leave this here.