
CLIPTextEncode Out of Memory Error since update

Open BenDes21 opened this issue 1 year ago • 5 comments

Expected Behavior

The workflow runs normally.

Actual Behavior

Hi, I'm getting a CLIPTextEncode "Allocation on device" error after updating ComfyUI, with the same workflow (I didn't modify anything). Before the update everything worked fine.

Steps to Reproduce

Run any workflow with the new version.

Debug Logs

File "C:\Users\Admin\Documents\ComfyUI_windows_portable\ComfyUI\comfy\model_patcher.py", line 427, in patch_model
    self.load(device_to, lowvram_model_memory=lowvram_model_memory, force_patch_weights=force_patch_weights, full_load=full_load)
  File "C:\Users\Admin\Documents\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-GGUF\nodes.py", line 186, in load
    super().load(*args, force_patch_weights=True, **kwargs)
  File "C:\Users\Admin\Documents\ComfyUI_windows_portable\ComfyUI\comfy\model_patcher.py", line 399, in load
    x[2].to(device_to)
  File "C:\Users\Admin\Documents\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1173, in to
    return self._apply(convert)
           ^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Admin\Documents\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 804, in _apply
    param_applied = fn(param)
                    ^^^^^^^^^
  File "C:\Users\Admin\Documents\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1159, in convert
    return t.to(
           ^^^^^
  File "C:\Users\Admin\Documents\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-GGUF\ops.py", line 23, in to
    new = super().to(*args, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Admin\Documents\ComfyUI_windows_portable\python_embeded\Lib\site-packages\torch\_tensor.py", line 1443, in __torch_function__
    ret = func(*args, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^
torch.cuda.OutOfMemoryError: Allocation on device

Got an OOM, unloading all loaded models.
Prompt executed in 4.43 seconds
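
A quick note on what the error above means: it is a plain CUDA out-of-memory failure raised while ComfyUI-GGUF moves the dequantized text-encoder weights onto the GPU. A minimal diagnostic sketch to check how much VRAM is actually free when it happens (assumes a CUDA build of PyTorch; this snippet is not part of ComfyUI):

import torch

# torch.cuda.mem_get_info() returns (free_bytes, total_bytes) for the current device.
free, total = torch.cuda.mem_get_info()
print(f"VRAM free: {free / 2**30:.2f} GiB of {total / 2**30:.2f} GiB")

# Release cached blocks held by PyTorch's allocator; this does not free
# memory used by other processes or by models that are still referenced.
torch.cuda.empty_cache()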

Other

No response

BenDes21 · Sep 09 '24 09:09

same thing happened to me :(

TheValkyr · Sep 11 '24 02:09

same

9Somboon · Sep 11 '24 17:09

same

AltoresMonaco · Sep 11 '24 22:09

same

markswjang · Sep 14 '24 14:09

I'm experiencing something similar: it happens every time I change the text prompt. The first run after changing the prompt fails; once I've run it, I can run it again successfully. Here's the log:

ComfyUI Error Report

Error Details

  • Node Type: CLIPTextEncode
  • Exception Type: torch.OutOfMemoryError
  • Exception Message: Allocation on device

Stack Trace

  File "/workspace/ComfyUI/execution.py", line 323, in execute
    output_data, output_ui, has_subgraph = get_output_data(obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)

  File "/workspace/ComfyUI/execution.py", line 198, in get_output_data
    return_values = _map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)

  File "/workspace/ComfyUI/execution.py", line 169, in _map_node_over_list
    process_inputs(input_dict, i)

  File "/workspace/ComfyUI/execution.py", line 158, in process_inputs
    results.append(getattr(obj, func)(**inputs))

  File "/workspace/ComfyUI/nodes.py", line 65, in encode
    output = clip.encode_from_tokens(tokens, return_pooled=True, return_dict=True)

  File "/workspace/ComfyUI/comfy/sd.py", line 125, in encode_from_tokens
    o = self.cond_stage_model.encode_token_weights(tokens)

  File "/workspace/ComfyUI/comfy/text_encoders/flux.py", line 59, in encode_token_weights
    t5_out, t5_pooled = self.t5xxl.encode_token_weights(token_weight_pairs_t5)

  File "/workspace/ComfyUI/comfy/sd1_clip.py", line 41, in encode_token_weights
    o = self.encode(to_encode)

  File "/workspace/ComfyUI/comfy/sd1_clip.py", line 229, in encode
    return self(tokens)

  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)

  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)

  File "/workspace/ComfyUI/comfy/sd1_clip.py", line 201, in forward
    outputs = self.transformer(tokens, attention_mask_model, intermediate_output=self.layer_idx, final_layer_norm_intermediate=self.layer_norm_hidden_state, dtype=torch.float32)

  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)

  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)

  File "/workspace/ComfyUI/comfy/text_encoders/t5.py", line 238, in forward
    x = self.shared(input_ids, out_dtype=kwargs.get("dtype", torch.float32))

  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)

  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)

  File "/workspace/ComfyUI/comfy/ops.py", line 211, in forward
    return self.forward_comfy_cast_weights(*args, **kwargs)

  File "/workspace/ComfyUI/custom_nodes/ComfyUI-GGUF/ops.py", line 152, in forward_comfy_cast_weights
    return self.forward_ggml_cast_weights(input, *args, **kwargs)

  File "/workspace/ComfyUI/custom_nodes/ComfyUI-GGUF/ops.py", line 187, in forward_ggml_cast_weights
    weight, _bias = self.cast_bias_weight(self, device=input.device, dtype=out_dtype)

  File "/workspace/ComfyUI/custom_nodes/ComfyUI-GGUF/ops.py", line 146, in cast_bias_weight
    weight = s.get_weight(s.weight.to(device), dtype)

  File "/workspace/ComfyUI/custom_nodes/ComfyUI-GGUF/ops.py", line 119, in get_weight
    weight = dequantize_tensor(tensor, dtype, self.dequant_dtype)

  File "/workspace/ComfyUI/custom_nodes/ComfyUI-GGUF/dequant.py", line 23, in dequantize_tensor
    return dequantize(tensor.data, qtype, oshape, dtype=dequant_dtype).to(dtype)

  File "/workspace/ComfyUI/custom_nodes/ComfyUI-GGUF/ops.py", line 23, in to
    new = super().to(*args, **kwargs)

  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/_tensor.py", line 1437, in __torch_function__
    ret = func(*args, **kwargs)

System Information

  • ComfyUI Version: v0.2.3-8-gf584758
  • Arguments: main.py --disable-auto-launch --port 18188
  • OS: posix
  • Python Version: 3.10.12 (main, Sep 11 2024, 15:47:36) [GCC 11.4.0]
  • Embedded Python: false
  • PyTorch Version: 2.4.1+cu121

Devices

  • Name: cuda:0 NVIDIA RTX 6000 Ada Generation : cudaMallocAsync
    • Type: cuda
    • VRAM Total: 51010207744
    • VRAM Free: 18704695296
    • Torch VRAM Total: 67108864
    • Torch VRAM Free: 58589184
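
For readability, those byte counts come out to roughly 47.5 GiB of total VRAM with about 17.4 GiB free at the time of the report. A quick conversion (my own arithmetic, not part of the generated report):

# Convert the reported VRAM figures from bytes to GiB.
vram_total = 51_010_207_744  # "VRAM Total" above
vram_free = 18_704_695_296   # "VRAM Free" above

GIB = 1024 ** 3
print(f"total: {vram_total / GIB:.1f} GiB")  # ~47.5 GiB
print(f"free:  {vram_free / GIB:.1f} GiB")   # ~17.4 GiB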

Logs

2024-10-15 19:01:34,134 - root - ERROR - Traceback (most recent call last):
  File "/workspace/ComfyUI/execution.py", line 323, in execute
    output_data, output_ui, has_subgraph = get_output_data(obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
  File "/workspace/ComfyUI/execution.py", line 198, in get_output_data
    return_values = _map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
  File "/workspace/ComfyUI/execution.py", line 169, in _map_node_over_list
    process_inputs(input_dict, i)
  File "/workspace/ComfyUI/execution.py", line 158, in process_inputs
    results.append(getattr(obj, func)(**inputs))
  File "/workspace/ComfyUI/nodes.py", line 65, in encode
    output = clip.encode_from_tokens(tokens, return_pooled=True, return_dict=True)
  File "/workspace/ComfyUI/comfy/sd.py", line 125, in encode_from_tokens
    o = self.cond_stage_model.encode_token_weights(tokens)
  File "/workspace/ComfyUI/comfy/text_encoders/flux.py", line 59, in encode_token_weights
    t5_out, t5_pooled = self.t5xxl.encode_token_weights(token_weight_pairs_t5)
  File "/workspace/ComfyUI/comfy/sd1_clip.py", line 41, in encode_token_weights
    o = self.encode(to_encode)
  File "/workspace/ComfyUI/comfy/sd1_clip.py", line 229, in encode
    return self(tokens)
  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/workspace/ComfyUI/comfy/sd1_clip.py", line 201, in forward
    outputs = self.transformer(tokens, attention_mask_model, intermediate_output=self.layer_idx, final_layer_norm_intermediate=self.layer_norm_hidden_state, dtype=torch.float32)
  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/workspace/ComfyUI/comfy/text_encoders/t5.py", line 238, in forward
    x = self.shared(input_ids, out_dtype=kwargs.get("dtype", torch.float32))
  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/workspace/ComfyUI/comfy/ops.py", line 211, in forward
    return self.forward_comfy_cast_weights(*args, **kwargs)
  File "/workspace/ComfyUI/custom_nodes/ComfyUI-GGUF/ops.py", line 152, in forward_comfy_cast_weights
    return self.forward_ggml_cast_weights(input, *args, **kwargs)
  File "/workspace/ComfyUI/custom_nodes/ComfyUI-GGUF/ops.py", line 187, in forward_ggml_cast_weights
    weight, _bias = self.cast_bias_weight(self, device=input.device, dtype=out_dtype)
  File "/workspace/ComfyUI/custom_nodes/ComfyUI-GGUF/ops.py", line 146, in cast_bias_weight
    weight = s.get_weight(s.weight.to(device), dtype)
  File "/workspace/ComfyUI/custom_nodes/ComfyUI-GGUF/ops.py", line 119, in get_weight
    weight = dequantize_tensor(tensor, dtype, self.dequant_dtype)
  File "/workspace/ComfyUI/custom_nodes/ComfyUI-GGUF/dequant.py", line 23, in dequantize_tensor
    return dequantize(tensor.data, qtype, oshape, dtype=dequant_dtype).to(dtype)
  File "/workspace/ComfyUI/custom_nodes/ComfyUI-GGUF/ops.py", line 23, in to
    new = super().to(*args, **kwargs)
  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/_tensor.py", line 1437, in __torch_function__
    ret = func(*args, **kwargs)
torch.OutOfMemoryError: Allocation on device 

2024-10-15 19:01:34,137 - root - ERROR - Got an OOM, unloading all loaded models.
2024-10-15 19:01:36,154 - root - INFO - Prompt executed in 2.54 seconds
2024-10-15 19:01:40,556 - root - INFO - got prompt
2024-10-15 19:01:40,573 - root - INFO - Requested to load FluxClipModel_
2024-10-15 19:01:40,573 - root - INFO - Loading 1 new model
2024-10-15 19:01:41,145 - root - INFO - loaded completely 0.0 5062.70263671875 True
2024-10-15 19:01:41,310 - root - INFO - Requested to load Flux
2024-10-15 19:01:41,310 - root - INFO - Loading 1 new model
2024-10-15 19:01:43,011 - root - INFO - loaded completely 0.0 12099.961059570312 True
2024-10-15 19:01:44,551 - root - INFO - Requested to load AutoencodingEngine
2024-10-15 19:01:44,551 - root - INFO - Loading 1 new model
2024-10-15 19:01:44,579 - root - INFO - loaded completely 0.0 159.87335777282715 True
2024-10-15 19:01:44,781 - root - INFO - Prompt executed in 4.21 seconds
2024-10-15 19:01:53,811 - root - INFO - got prompt
2024-10-15 19:01:55,554 - root - INFO - Prompt executed in 1.73 seconds
2024-10-15 19:02:02,650 - root - INFO - got prompt
2024-10-15 19:02:04,400 - root - INFO - Prompt executed in 1.74 seconds
2024-10-15 19:02:15,253 - root - INFO - got prompt
2024-10-15 19:02:17,017 - root - INFO - Prompt executed in 1.75 seconds
2024-10-15 19:02:24,659 - root - INFO - got prompt
2024-10-15 19:02:26,440 - root - INFO - Prompt executed in 1.77 seconds
2024-10-15 19:02:37,626 - root - INFO - got prompt
2024-10-15 19:02:39,396 - root - INFO - Prompt executed in 1.76 seconds
2024-10-15 19:02:54,700 - root - INFO - got prompt
2024-10-15 19:02:54,717 - root - INFO - Requested to load FluxClipModel_
2024-10-15 19:02:54,717 - root - INFO - Loading 1 new model
2024-10-15 19:02:55,235 - root - INFO - loaded completely 0.0 5062.70263671875 True
2024-10-15 19:02:55,250 - root - ERROR - !!! Exception during processing !!! Allocation on device 
2024-10-15 19:02:55,256 - root - ERROR - Traceback (most recent call last):
  File "/workspace/ComfyUI/execution.py", line 323, in execute
    output_data, output_ui, has_subgraph = get_output_data(obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
  File "/workspace/ComfyUI/execution.py", line 198, in get_output_data
    return_values = _map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
  File "/workspace/ComfyUI/execution.py", line 169, in _map_node_over_list
    process_inputs(input_dict, i)
  File "/workspace/ComfyUI/execution.py", line 158, in process_inputs
    results.append(getattr(obj, func)(**inputs))
  File "/workspace/ComfyUI/nodes.py", line 65, in encode
    output = clip.encode_from_tokens(tokens, return_pooled=True, return_dict=True)
  File "/workspace/ComfyUI/comfy/sd.py", line 125, in encode_from_tokens
    o = self.cond_stage_model.encode_token_weights(tokens)
  File "/workspace/ComfyUI/comfy/text_encoders/flux.py", line 59, in encode_token_weights
    t5_out, t5_pooled = self.t5xxl.encode_token_weights(token_weight_pairs_t5)
  File "/workspace/ComfyUI/comfy/sd1_clip.py", line 41, in encode_token_weights
    o = self.encode(to_encode)
  File "/workspace/ComfyUI/comfy/sd1_clip.py", line 229, in encode
    return self(tokens)
  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/workspace/ComfyUI/comfy/sd1_clip.py", line 201, in forward
    outputs = self.transformer(tokens, attention_mask_model, intermediate_output=self.layer_idx, final_layer_norm_intermediate=self.layer_norm_hidden_state, dtype=torch.float32)
  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/workspace/ComfyUI/comfy/text_encoders/t5.py", line 238, in forward
    x = self.shared(input_ids, out_dtype=kwargs.get("dtype", torch.float32))
  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/workspace/ComfyUI/comfy/ops.py", line 211, in forward
    return self.forward_comfy_cast_weights(*args, **kwargs)
  File "/workspace/ComfyUI/custom_nodes/ComfyUI-GGUF/ops.py", line 152, in forward_comfy_cast_weights
    return self.forward_ggml_cast_weights(input, *args, **kwargs)
  File "/workspace/ComfyUI/custom_nodes/ComfyUI-GGUF/ops.py", line 187, in forward_ggml_cast_weights
    weight, _bias = self.cast_bias_weight(self, device=input.device, dtype=out_dtype)
  File "/workspace/ComfyUI/custom_nodes/ComfyUI-GGUF/ops.py", line 146, in cast_bias_weight
    weight = s.get_weight(s.weight.to(device), dtype)
  File "/workspace/ComfyUI/custom_nodes/ComfyUI-GGUF/ops.py", line 119, in get_weight
    weight = dequantize_tensor(tensor, dtype, self.dequant_dtype)
  File "/workspace/ComfyUI/custom_nodes/ComfyUI-GGUF/dequant.py", line 23, in dequantize_tensor
    return dequantize(tensor.data, qtype, oshape, dtype=dequant_dtype).to(dtype)
  File "/workspace/ComfyUI/custom_nodes/ComfyUI-GGUF/ops.py", line 23, in to
    new = super().to(*args, **kwargs)
  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/_tensor.py", line 1437, in __torch_function__
    ret = func(*args, **kwargs)
torch.OutOfMemoryError: Allocation on device 

2024-10-15 19:02:55,259 - root - ERROR - Got an OOM, unloading all loaded models.
2024-10-15 19:02:57,615 - root - INFO - Prompt executed in 2.91 seconds
2024-10-15 19:03:01,228 - root - INFO - got prompt
2024-10-15 19:03:01,265 - root - INFO - Requested to load FluxClipModel_
2024-10-15 19:03:01,266 - root - INFO - Loading 1 new model
2024-10-15 19:03:01,687 - root - INFO - loaded completely 0.0 5062.70263671875 True
2024-10-15 19:03:01,854 - root - INFO - Requested to load Flux
2024-10-15 19:03:01,855 - root - INFO - Loading 1 new model
2024-10-15 19:03:03,512 - root - INFO - loaded completely 0.0 12099.961059570312 True
2024-10-15 19:03:05,067 - root - INFO - Requested to load AutoencodingEngine
2024-10-15 19:03:05,067 - root - INFO - Loading 1 new model
2024-10-15 19:03:05,094 - root - INFO - loaded completely 0.0 159.87335777282715 True
2024-10-15 19:03:05,302 - root - INFO - Prompt executed in 4.04 seconds
2024-10-15 19:03:14,582 - root - INFO - got prompt
2024-10-15 19:03:16,364 - root - INFO - Prompt executed in 1.77 seconds
2024-10-15 19:03:22,869 - root - INFO - got prompt
2024-10-15 19:03:24,627 - root - INFO - Prompt executed in 1.75 seconds
2024-10-15 19:03:34,347 - root - INFO - got prompt
2024-10-15 19:03:34,365 - root - INFO - Requested to load FluxClipModel_
2024-10-15 19:03:34,365 - root - INFO - Loading 1 new model
2024-10-15 19:03:34,784 - root - INFO - loaded completely 0.0 5062.70263671875 True
2024-10-15 19:03:34,799 - root - ERROR - !!! Exception during processing !!! Allocation on device 
2024-10-15 19:03:34,804 - root - ERROR - Traceback (most recent call last):
  File "/workspace/ComfyUI/execution.py", line 323, in execute
    output_data, output_ui, has_subgraph = get_output_data(obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
  File "/workspace/ComfyUI/execution.py", line 198, in get_output_data
    return_values = _map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
  File "/workspace/ComfyUI/execution.py", line 169, in _map_node_over_list
    process_inputs(input_dict, i)
  File "/workspace/ComfyUI/execution.py", line 158, in process_inputs
    results.append(getattr(obj, func)(**inputs))
  File "/workspace/ComfyUI/nodes.py", line 65, in encode
    output = clip.encode_from_tokens(tokens, return_pooled=True, return_dict=True)
  File "/workspace/ComfyUI/comfy/sd.py", line 125, in encode_from_tokens
    o = self.cond_stage_model.encode_token_weights(tokens)
  File "/workspace/ComfyUI/comfy/text_encoders/flux.py", line 59, in encode_token_weights
    t5_out, t5_pooled = self.t5xxl.encode_token_weights(token_weight_pairs_t5)
  File "/workspace/ComfyUI/comfy/sd1_clip.py", line 41, in encode_token_weights
    o = self.encode(to_encode)
  File "/workspace/ComfyUI/comfy/sd1_clip.py", line 229, in encode
    return self(tokens)
  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/workspace/ComfyUI/comfy/sd1_clip.py", line 201, in forward
    outputs = self.transformer(tokens, attention_mask_model, intermediate_output=self.layer_idx, final_layer_norm_intermediate=self.layer_norm_hidden_state, dtype=torch.float32)
  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/workspace/ComfyUI/comfy/text_encoders/t5.py", line 238, in forward
    x = self.shared(input_ids, out_dtype=kwargs.get("dtype", torch.float32))
  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/workspace/ComfyUI/comfy/ops.py", line 211, in forward
    return self.forward_comfy_cast_weights(*args, **kwargs)
  File "/workspace/ComfyUI/custom_nodes/ComfyUI-GGUF/ops.py", line 152, in forward_comfy_cast_weights
    return self.forward_ggml_cast_weights(input, *args, **kwargs)
  File "/workspace/ComfyUI/custom_nodes/ComfyUI-GGUF/ops.py", line 187, in forward_ggml_cast_weights
    weight, _bias = self.cast_bias_weight(self, device=input.device, dtype=out_dtype)
  File "/workspace/ComfyUI/custom_nodes/ComfyUI-GGUF/ops.py", line 146, in cast_bias_weight
    weight = s.get_weight(s.weight.to(device), dtype)
  File "/workspace/ComfyUI/custom_nodes/ComfyUI-GGUF/ops.py", line 119, in get_weight
    weight = dequantize_tensor(tensor, dtype, self.dequant_dtype)
  File "/workspace/ComfyUI/custom_nodes/ComfyUI-GGUF/dequant.py", line 23, in dequantize_tensor
    return dequantize(tensor.data, qtype, oshape, dtype=dequant_dtype).to(dtype)
  File "/workspace/ComfyUI/custom_nodes/ComfyUI-GGUF/ops.py", line 23, in to
    new = super().to(*args, **kwargs)
  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/_tensor.py", line 1437, in __torch_function__
    ret = func(*args, **kwargs)
torch.OutOfMemoryError: Allocation on device 

2024-10-15 19:03:34,806 - root - ERROR - Got an OOM, unloading all loaded models.
2024-10-15 19:03:36,957 - root - INFO - Prompt executed in 2.60 seconds
2024-10-15 19:03:40,405 - root - INFO - got prompt
2024-10-15 19:03:40,428 - root - INFO - Requested to load FluxClipModel_
2024-10-15 19:03:40,429 - root - INFO - Loading 1 new model
2024-10-15 19:03:40,859 - root - INFO - loaded completely 0.0 5062.70263671875 True
2024-10-15 19:03:41,029 - root - INFO - Requested to load Flux
2024-10-15 19:03:41,030 - root - INFO - Loading 1 new model
2024-10-15 19:03:42,636 - root - INFO - loaded completely 0.0 12099.961059570312 True
2024-10-15 19:03:44,181 - root - INFO - Requested to load AutoencodingEngine
2024-10-15 19:03:44,181 - root - INFO - Loading 1 new model
2024-10-15 19:03:44,238 - root - INFO - loaded completely 0.0 159.87335777282715 True
2024-10-15 19:03:44,431 - root - INFO - Prompt executed in 4.01 seconds
2024-10-15 19:03:49,936 - root - INFO - got prompt
2024-10-15 19:03:51,700 - root - INFO - Prompt executed in 1.75 seconds
2024-10-15 19:04:03,688 - root - INFO - got prompt
2024-10-15 19:04:03,705 - root - INFO - Requested to load FluxClipModel_
2024-10-15 19:04:03,705 - root - INFO - Loading 1 new model
2024-10-15 19:04:04,410 - root - INFO - loaded completely 0.0 5062.70263671875 True
2024-10-15 19:04:04,425 - root - ERROR - !!! Exception during processing !!! Allocation on device 
2024-10-15 19:04:04,430 - root - ERROR - Traceback (most recent call last):
  File "/workspace/ComfyUI/execution.py", line 323, in execute
    output_data, output_ui, has_subgraph = get_output_data(obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
  File "/workspace/ComfyUI/execution.py", line 198, in get_output_data
    return_values = _map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
  File "/workspace/ComfyUI/execution.py", line 169, in _map_node_over_list
    process_inputs(input_dict, i)
  File "/workspace/ComfyUI/execution.py", line 158, in process_inputs
    results.append(getattr(obj, func)(**inputs))
  File "/workspace/ComfyUI/nodes.py", line 65, in encode
    output = clip.encode_from_tokens(tokens, return_pooled=True, return_dict=True)
  File "/workspace/ComfyUI/comfy/sd.py", line 125, in encode_from_tokens
    o = self.cond_stage_model.encode_token_weights(tokens)
  File "/workspace/ComfyUI/comfy/text_encoders/flux.py", line 59, in encode_token_weights
    t5_out, t5_pooled = self.t5xxl.encode_token_weights(token_weight_pairs_t5)
  File "/workspace/ComfyUI/comfy/sd1_clip.py", line 41, in encode_token_weights
    o = self.encode(to_encode)
  File "/workspace/ComfyUI/comfy/sd1_clip.py", line 229, in encode
    return self(tokens)
  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/workspace/ComfyUI/comfy/sd1_clip.py", line 201, in forward
    outputs = self.transformer(tokens, attention_mask_model, intermediate_output=self.layer_idx, final_layer_norm_intermediate=self.layer_norm_hidden_state, dtype=torch.float32)
  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/workspace/ComfyUI/comfy/text_encoders/t5.py", line 238, in forward
    x = self.shared(input_ids, out_dtype=kwargs.get("dtype", torch.float32))
  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/workspace/ComfyUI/comfy/ops.py", line 211, in forward
    return self.forward_comfy_cast_weights(*args, **kwargs)
  File "/workspace/ComfyUI/custom_nodes/ComfyUI-GGUF/ops.py", line 152, in forward_comfy_cast_weights
    return self.forward_ggml_cast_weights(input, *args, **kwargs)
  File "/workspace/ComfyUI/custom_nodes/ComfyUI-GGUF/ops.py", line 187, in forward_ggml_cast_weights
    weight, _bias = self.cast_bias_weight(self, device=input.device, dtype=out_dtype)
  File "/workspace/ComfyUI/custom_nodes/ComfyUI-GGUF/ops.py", line 146, in cast_bias_weight
    weight = s.get_weight(s.weight.to(device), dtype)
  File "/workspace/ComfyUI/custom_nodes/ComfyUI-GGUF/ops.py", line 119, in get_weight
    weight = dequantize_tensor(tensor, dtype, self.dequant_dtype)
  File "/workspace/ComfyUI/custom_nodes/ComfyUI-GGUF/dequant.py", line 23, in dequantize_tensor
    return dequantize(tensor.data, qtype, oshape, dtype=dequant_dtype).to(dtype)
  File "/workspace/ComfyUI/custom_nodes/ComfyUI-GGUF/ops.py", line 23, in to
    new = super().to(*args, **kwargs)
  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/_tensor.py", line 1437, in __torch_function__
    ret = func(*args, **kwargs)
torch.OutOfMemoryError: Allocation on device 

2024-10-15 19:04:04,433 - root - ERROR - Got an OOM, unloading all loaded models.
2024-10-15 19:04:06,336 - root - INFO - Prompt executed in 2.63 seconds
2024-10-15 19:04:09,723 - root - INFO - got prompt
2024-10-15 19:04:09,737 - root - INFO - Requested to load FluxClipModel_
2024-10-15 19:04:09,737 - root - INFO - Loading 1 new model
2024-10-15 19:04:10,147 - root - INFO - loaded completely 0.0 5062.70263671875 True
2024-10-15 19:04:10,311 - root - INFO - Requested to load Flux
2024-10-15 19:04:10,312 - root - INFO - Loading 1 new model
2024-10-15 19:04:11,795 - root - INFO - loaded completely 0.0 12099.961059570312 True
2024-10-15 19:04:13,348 - root - INFO - Requested to load AutoencodingEngine
2024-10-15 19:04:13,349 - root - INFO - Loading 1 new model
2024-10-15 19:04:13,378 - root - INFO - loaded completely 0.0 159.87335777282715 True
2024-10-15 19:04:13,578 - root - INFO - Prompt executed in 3.84 seconds
2024-10-15 19:04:17,539 - root - INFO - got prompt
2024-10-15 19:04:19,335 - root - INFO - Prompt executed in 1.78 seconds
2024-10-15 19:04:23,699 - root - INFO - got prompt
2024-10-15 19:04:25,500 - root - INFO - Prompt executed in 1.79 seconds
2024-10-15 19:04:32,345 - root - INFO - got prompt
2024-10-15 19:04:34,135 - root - INFO - Prompt executed in 1.78 seconds
2024-10-15 19:04:38,334 - root - INFO - got prompt
2024-10-15 19:04:40,166 - root - INFO - Prompt executed in 1.82 seconds
2024-10-15 19:04:45,344 - root - INFO - got prompt
2024-10-15 19:04:47,113 - root - INFO - Prompt executed in 1.76 seconds
2024-10-15 19:04:51,646 - root - INFO - got prompt
2024-10-15 19:04:53,418 - root - INFO - Prompt executed in 1.76 seconds
2024-10-15 19:04:55,717 - root - INFO - got prompt
2024-10-15 19:04:57,538 - root - INFO - Prompt executed in 1.81 seconds
2024-10-15 19:05:07,540 - root - INFO - got prompt
2024-10-15 19:05:09,328 - root - INFO - Prompt executed in 1.78 seconds
2024-10-15 19:05:15,892 - root - INFO - got prompt
2024-10-15 19:05:17,668 - root - INFO - Prompt executed in 1.76 seconds
2024-10-15 19:05:38,901 - root - INFO - got prompt
2024-10-15 19:05:38,918 - root - INFO - Requested to load FluxClipModel_
2024-10-15 19:05:38,918 - root - INFO - Loading 1 new model
2024-10-15 19:05:40,110 - root - INFO - loaded completely 0.0 5062.70263671875 True
2024-10-15 19:05:40,124 - root - ERROR - !!! Exception during processing !!! Allocation on device 
2024-10-15 19:05:40,128 - root - ERROR - Traceback (most recent call last):
  File "/workspace/ComfyUI/execution.py", line 323, in execute
    output_data, output_ui, has_subgraph = get_output_data(obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
  File "/workspace/ComfyUI/execution.py", line 198, in get_output_data
    return_values = _map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
  File "/workspace/ComfyUI/execution.py", line 169, in _map_node_over_list
    process_inputs(input_dict, i)
  File "/workspace/ComfyUI/execution.py", line 158, in process_inputs
    results.append(getattr(obj, func)(**inputs))
  File "/workspace/ComfyUI/nodes.py", line 65, in encode
    output = clip.encode_from_tokens(tokens, return_pooled=True, return_dict=True)
  File "/workspace/ComfyUI/comfy/sd.py", line 125, in encode_from_tokens
    o = self.cond_stage_model.encode_token_weights(tokens)
  File "/workspace/ComfyUI/comfy/text_encoders/flux.py", line 59, in encode_token_weights
    t5_out, t5_pooled = self.t5xxl.encode_token_weights(token_weight_pairs_t5)
  File "/workspace/ComfyUI/comfy/sd1_clip.py", line 41, in encode_token_weights
    o = self.encode(to_encode)
  File "/workspace/ComfyUI/comfy/sd1_clip.py", line 229, in encode
    return self(tokens)
  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/workspace/ComfyUI/comfy/sd1_clip.py", line 201, in forward
    outputs = self.transformer(tokens, attention_mask_model, intermediate_output=self.layer_idx, final_layer_norm_intermediate=self.layer_norm_hidden_state, dtype=torch.float32)
  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/workspace/ComfyUI/comfy/text_encoders/t5.py", line 238, in forward
    x = self.shared(input_ids, out_dtype=kwargs.get("dtype", torch.float32))
  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/workspace/ComfyUI/comfy/ops.py", line 211, in forward
    return self.forward_comfy_cast_weights(*args, **kwargs)
  File "/workspace/ComfyUI/custom_nodes/ComfyUI-GGUF/ops.py", line 152, in forward_comfy_cast_weights
    return self.forward_ggml_cast_weights(input, *args, **kwargs)
  File "/workspace/ComfyUI/custom_nodes/ComfyUI-GGUF/ops.py", line 187, in forward_ggml_cast_weights
    weight, _bias = self.cast_bias_weight(self, device=input.device, dtype=out_dtype)
  File "/workspace/ComfyUI/custom_nodes/ComfyUI-GGUF/ops.py", line 146, in cast_bias_weight
    weight = s.get_weight(s.weight.to(device), dtype)
  File "/workspace/ComfyUI/custom_nodes/ComfyUI-GGUF/ops.py", line 119, in get_weight
    weight = dequantize_tensor(tensor, dtype, self.dequant_dtype)
  File "/workspace/ComfyUI/custom_nodes/ComfyUI-GGUF/dequant.py", line 23, in dequantize_tensor
    return dequantize(tensor.data, qtype, oshape, dtype=dequant_dtype).to(dtype)
  File "/workspace/ComfyUI/custom_nodes/ComfyUI-GGUF/ops.py", line 23, in to
    new = super().to(*args, **kwargs)
  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/_tensor.py", line 1437, in __torch_function__
    ret = func(*args, **kwargs)
torch.OutOfMemoryError: Allocation on device 

2024-10-15 19:05:40,130 - root - ERROR - Got an OOM, unloading all loaded models.
2024-10-15 19:05:42,529 - root - INFO - Prompt executed in 3.62 seconds
2024-10-15 19:05:46,675 - root - INFO - got prompt
2024-10-15 19:05:46,689 - root - INFO - Requested to load FluxClipModel_
2024-10-15 19:05:46,689 - root - INFO - Loading 1 new model
2024-10-15 19:05:47,114 - root - INFO - loaded completely 0.0 5062.70263671875 True
2024-10-15 19:05:47,281 - root - INFO - Requested to load Flux
2024-10-15 19:05:47,282 - root - INFO - Loading 1 new model
2024-10-15 19:05:48,947 - root - INFO - loaded completely 0.0 12099.961059570312 True
2024-10-15 19:05:50,507 - root - INFO - Requested to load AutoencodingEngine
2024-10-15 19:05:50,507 - root - INFO - Loading 1 new model
2024-10-15 19:05:50,536 - root - INFO - loaded completely 0.0 159.87335777282715 True
2024-10-15 19:05:50,757 - root - INFO - Prompt executed in 4.07 seconds
2024-10-15 19:05:56,814 - root - INFO - got prompt
2024-10-15 19:05:58,592 - root - INFO - Prompt executed in 1.77 seconds
2024-10-15 19:06:01,360 - root - INFO - got prompt
2024-10-15 19:06:03,196 - root - INFO - Prompt executed in 1.83 seconds
2024-10-15 19:06:37,558 - root - INFO - got prompt
2024-10-15 19:06:37,572 - root - INFO - Requested to load FluxClipModel_
2024-10-15 19:06:37,572 - root - INFO - Loading 1 new model
2024-10-15 19:06:37,963 - root - INFO - loaded completely 0.0 5062.70263671875 True
2024-10-15 19:06:37,978 - root - ERROR - !!! Exception during processing !!! Allocation on device 
2024-10-15 19:06:37,983 - root - ERROR - Traceback (most recent call last):
  File "/workspace/ComfyUI/execution.py", line 323, in execute
    output_data, output_ui, has_subgraph = get_output_data(obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
  File "/workspace/ComfyUI/execution.py", line 198, in get_output_data
    return_values = _map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
  File "/workspace/ComfyUI/execution.py", line 169, in _map_node_over_list
    process_inputs(input_dict, i)
  File "/workspace/ComfyUI/execution.py", line 158, in process_inputs
    results.append(getattr(obj, func)(**inputs))
  File "/workspace/ComfyUI/nodes.py", line 65, in encode
    output = clip.encode_from_tokens(tokens, return_pooled=True, return_dict=True)
  File "/workspace/ComfyUI/comfy/sd.py", line 125, in encode_from_tokens
    o = self.cond_stage_model.encode_token_weights(tokens)
  File "/workspace/ComfyUI/comfy/text_encoders/flux.py", line 59, in encode_token_weights
    t5_out, t5_pooled = self.t5xxl.encode_token_weights(token_weight_pairs_t5)
  File "/workspace/ComfyUI/comfy/sd1_clip.py", line 41, in encode_token_weights
    o = self.encode(to_encode)
  File "/workspace/ComfyUI/comfy/sd1_clip.py", line 229, in encode
    return self(tokens)
  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/workspace/ComfyUI/comfy/sd1_clip.py", line 201, in forward
    outputs = self.transformer(tokens, attention_mask_model, intermediate_output=self.layer_idx, final_layer_norm_intermediate=self.layer_norm_hidden_state, dtype=torch.float32)
  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/workspace/ComfyUI/comfy/text_encoders/t5.py", line 238, in forward
    x = self.shared(input_ids, out_dtype=kwargs.get("dtype", torch.float32))
  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/workspace/ComfyUI/comfy/ops.py", line 211, in forward
    return self.forward_comfy_cast_weights(*args, **kwargs)
  File "/workspace/ComfyUI/custom_nodes/ComfyUI-GGUF/ops.py", line 152, in forward_comfy_cast_weights
    return self.forward_ggml_cast_weights(input, *args, **kwargs)
  File "/workspace/ComfyUI/custom_nodes/ComfyUI-GGUF/ops.py", line 187, in forward_ggml_cast_weights
    weight, _bias = self.cast_bias_weight(self, device=input.device, dtype=out_dtype)
  File "/workspace/ComfyUI/custom_nodes/ComfyUI-GGUF/ops.py", line 146, in cast_bias_weight
    weight = s.get_weight(s.weight.to(device), dtype)
  File "/workspace/ComfyUI/custom_nodes/ComfyUI-GGUF/ops.py", line 119, in get_weight
    weight = dequantize_tensor(tensor, dtype, self.dequant_dtype)
  File "/workspace/ComfyUI/custom_nodes/ComfyUI-GGUF/dequant.py", line 23, in dequantize_tensor
    return dequantize(tensor.data, qtype, oshape, dtype=dequant_dtype).to(dtype)
  File "/workspace/ComfyUI/custom_nodes/ComfyUI-GGUF/ops.py", line 23, in to
    new = super().to(*args, **kwargs)
  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/_tensor.py", line 1437, in __torch_function__
    ret = func(*args, **kwargs)
torch.OutOfMemoryError: Allocation on device 

2024-10-15 19:06:37,985 - root - ERROR - Got an OOM, unloading all loaded models.
2024-10-15 19:06:40,145 - root - INFO - Prompt executed in 2.58 seconds
2024-10-15 19:06:42,671 - root - INFO - got prompt
2024-10-15 19:06:42,686 - root - INFO - Requested to load FluxClipModel_
2024-10-15 19:06:42,686 - root - INFO - Loading 1 new model
2024-10-15 19:06:43,065 - root - INFO - loaded completely 0.0 5062.70263671875 True
2024-10-15 19:06:43,234 - root - INFO - Requested to load Flux
2024-10-15 19:06:43,235 - root - INFO - Loading 1 new model
2024-10-15 19:06:45,020 - root - INFO - loaded completely 0.0 12099.961059570312 True
2024-10-15 19:06:46,571 - root - INFO - Requested to load AutoencodingEngine
2024-10-15 19:06:46,572 - root - INFO - Loading 1 new model
2024-10-15 19:06:46,603 - root - INFO - loaded completely 0.0 159.87335777282715 True
2024-10-15 19:06:46,809 - root - INFO - Prompt executed in 4.13 seconds
2024-10-15 19:06:51,229 - root - INFO - got prompt
2024-10-15 19:06:53,020 - root - INFO - Prompt executed in 1.78 seconds
2024-10-15 19:06:57,057 - root - INFO - got prompt
2024-10-15 19:06:58,873 - root - INFO - Prompt executed in 1.81 seconds
2024-10-15 19:10:06,384 - root - INFO - got prompt
2024-10-15 19:10:08,158 - root - INFO - Prompt executed in 1.76 seconds
2024-10-15 20:09:11,769 - root - INFO - got prompt
2024-10-15 20:09:13,548 - root - INFO - Prompt executed in 1.76 seconds
2024-10-15 20:09:27,885 - root - INFO - got prompt
2024-10-15 20:09:29,650 - root - INFO - Prompt executed in 1.75 seconds
2024-10-15 20:09:53,947 - root - INFO - got prompt
2024-10-15 20:09:55,727 - root - INFO - Prompt executed in 1.77 seconds
2024-10-15 20:10:18,289 - root - INFO - got prompt
2024-10-15 20:10:21,200 - root - INFO - Prompt executed in 2.90 seconds
2024-10-15 20:10:26,218 - root - INFO - got prompt
2024-10-15 20:10:29,144 - root - INFO - Prompt executed in 2.92 seconds
2024-10-15 20:10:40,864 - root - INFO - got prompt
2024-10-15 20:10:43,818 - root - INFO - Prompt executed in 2.94 seconds
2024-10-15 20:10:53,759 - root - INFO - got prompt
2024-10-15 20:10:56,703 - root - INFO - Prompt executed in 2.93 seconds
2024-10-15 20:11:33,009 - root - INFO - got prompt
2024-10-15 20:11:35,945 - root - INFO - Prompt executed in 2.91 seconds
2024-10-15 20:11:37,763 - root - INFO - got prompt
2024-10-15 20:11:40,699 - root - INFO - Prompt executed in 2.93 seconds
2024-10-15 20:11:50,443 - root - INFO - got prompt
2024-10-15 20:11:50,457 - root - INFO - Requested to load FluxClipModel_
2024-10-15 20:11:50,457 - root - INFO - Loading 1 new model
2024-10-15 20:11:51,625 - root - INFO - loaded completely 0.0 5062.70263671875 True
2024-10-15 20:11:51,640 - root - ERROR - !!! Exception during processing !!! Allocation on device 
2024-10-15 20:11:51,644 - root - ERROR - Traceback (most recent call last):
  File "/workspace/ComfyUI/execution.py", line 323, in execute
    output_data, output_ui, has_subgraph = get_output_data(obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
  File "/workspace/ComfyUI/execution.py", line 198, in get_output_data
    return_values = _map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
  File "/workspace/ComfyUI/execution.py", line 169, in _map_node_over_list
    process_inputs(input_dict, i)
  File "/workspace/ComfyUI/execution.py", line 158, in process_inputs
    results.append(getattr(obj, func)(**inputs))
  File "/workspace/ComfyUI/nodes.py", line 65, in encode
    output = clip.encode_from_tokens(tokens, return_pooled=True, return_dict=True)
  File "/workspace/ComfyUI/comfy/sd.py", line 125, in encode_from_tokens
    o = self.cond_stage_model.encode_token_weights(tokens)
  File "/workspace/ComfyUI/comfy/text_encoders/flux.py", line 59, in encode_token_weights
    t5_out, t5_pooled = self.t5xxl.encode_token_weights(token_weight_pairs_t5)
  File "/workspace/ComfyUI/comfy/sd1_clip.py", line 41, in encode_token_weights
    o = self.encode(to_encode)
  File "/workspace/ComfyUI/comfy/sd1_clip.py", line 229, in encode
    return self(tokens)
  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/workspace/ComfyUI/comfy/sd1_clip.py", line 201, in forward
    outputs = self.transformer(tokens, attention_mask_model, intermediate_output=self.layer_idx, final_layer_norm_intermediate=self.layer_norm_hidden_state, dtype=torch.float32)
  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/workspace/ComfyUI/comfy/text_encoders/t5.py", line 238, in forward
    x = self.shared(input_ids, out_dtype=kwargs.get("dtype", torch.float32))
  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/workspace/ComfyUI/comfy/ops.py", line 211, in forward
    return self.forward_comfy_cast_weights(*args, **kwargs)
  File "/workspace/ComfyUI/custom_nodes/ComfyUI-GGUF/ops.py", line 152, in forward_comfy_cast_weights
    return self.forward_ggml_cast_weights(input, *args, **kwargs)
  File "/workspace/ComfyUI/custom_nodes/ComfyUI-GGUF/ops.py", line 187, in forward_ggml_cast_weights
    weight, _bias = self.cast_bias_weight(self, device=input.device, dtype=out_dtype)
  File "/workspace/ComfyUI/custom_nodes/ComfyUI-GGUF/ops.py", line 146, in cast_bias_weight
    weight = s.get_weight(s.weight.to(device), dtype)
  File "/workspace/ComfyUI/custom_nodes/ComfyUI-GGUF/ops.py", line 119, in get_weight
    weight = dequantize_tensor(tensor, dtype, self.dequant_dtype)
  File "/workspace/ComfyUI/custom_nodes/ComfyUI-GGUF/dequant.py", line 23, in dequantize_tensor
    return dequantize(tensor.data, qtype, oshape, dtype=dequant_dtype).to(dtype)
  File "/workspace/ComfyUI/custom_nodes/ComfyUI-GGUF/ops.py", line 23, in to
    new = super().to(*args, **kwargs)
  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/_tensor.py", line 1437, in __torch_function__
    ret = func(*args, **kwargs)
torch.OutOfMemoryError: Allocation on device 

2024-10-15 20:11:51,647 - root - ERROR - Got an OOM, unloading all loaded models.
2024-10-15 20:11:54,303 - root - INFO - Prompt executed in 3.85 seconds
2024-10-15 20:12:04,988 - root - INFO - got prompt
2024-10-15 20:12:05,007 - root - INFO - Requested to load FluxClipModel_
2024-10-15 20:12:05,008 - root - INFO - Loading 1 new model
2024-10-15 20:12:05,552 - root - INFO - loaded completely 0.0 5062.70263671875 True
2024-10-15 20:12:05,719 - root - INFO - Requested to load Flux
2024-10-15 20:12:05,719 - root - INFO - Loading 1 new model
2024-10-15 20:12:07,295 - root - INFO - loaded completely 0.0 12099.961059570312 True
2024-10-15 20:12:09,837 - root - INFO - Requested to load AutoencodingEngine
2024-10-15 20:12:09,838 - root - INFO - Loading 1 new model
2024-10-15 20:12:09,866 - root - INFO - loaded completely 0.0 159.87335777282715 True
2024-10-15 20:12:10,234 - root - INFO - Prompt executed in 5.23 seconds
2024-10-15 20:12:17,756 - root - INFO - got prompt
2024-10-15 20:12:17,772 - root - INFO - Requested to load FluxClipModel_
2024-10-15 20:12:17,773 - root - INFO - Loading 1 new model
2024-10-15 20:12:18,230 - root - INFO - loaded completely 0.0 5062.70263671875 True
2024-10-15 20:12:18,244 - root - ERROR - !!! Exception during processing !!! Allocation on device 
2024-10-15 20:12:18,248 - root - ERROR - Traceback (most recent call last):
  File "/workspace/ComfyUI/execution.py", line 323, in execute
    output_data, output_ui, has_subgraph = get_output_data(obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
  File "/workspace/ComfyUI/execution.py", line 198, in get_output_data
    return_values = _map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
  File "/workspace/ComfyUI/execution.py", line 169, in _map_node_over_list
    process_inputs(input_dict, i)
  File "/workspace/ComfyUI/execution.py", line 158, in process_inputs
    results.append(getattr(obj, func)(**inputs))
  File "/workspace/ComfyUI/nodes.py", line 65, in encode
    output = clip.encode_from_tokens(tokens, return_pooled=True, return_dict=True)
  File "/workspace/ComfyUI/comfy/sd.py", line 125, in encode_from_tokens
    o = self.cond_stage_model.encode_token_weights(tokens)
  File "/workspace/ComfyUI/comfy/text_encoders/flux.py", line 59, in encode_token_weights
    t5_out, t5_pooled = self.t5xxl.encode_token_weights(token_weight_pairs_t5)
  File "/workspace/ComfyUI/comfy/sd1_clip.py", line 41, in encode_token_weights
    o = self.encode(to_encode)
  File "/workspace/ComfyUI/comfy/sd1_clip.py", line 229, in encode
    return self(tokens)
  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/workspace/ComfyUI/comfy/sd1_clip.py", line 201, in forward
    outputs = self.transformer(tokens, attention_mask_model, intermediate_output=self.layer_idx, final_layer_norm_intermediate=self.layer_norm_hidden_state, dtype=torch.float32)
  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/workspace/ComfyUI/comfy/text_encoders/t5.py", line 238, in forward
    x = self.shared(input_ids, out_dtype=kwargs.get("dtype", torch.float32))
  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/workspace/ComfyUI/comfy/ops.py", line 211, in forward
    return self.forward_comfy_cast_weights(*args, **kwargs)
  File "/workspace/ComfyUI/custom_nodes/ComfyUI-GGUF/ops.py", line 152, in forward_comfy_cast_weights
    return self.forward_ggml_cast_weights(input, *args, **kwargs)
  File "/workspace/ComfyUI/custom_nodes/ComfyUI-GGUF/ops.py", line 187, in forward_ggml_cast_weights
    weight, _bias = self.cast_bias_weight(self, device=input.device, dtype=out_dtype)
  File "/workspace/ComfyUI/custom_nodes/ComfyUI-GGUF/ops.py", line 146, in cast_bias_weight
    weight = s.get_weight(s.weight.to(device), dtype)
  File "/workspace/ComfyUI/custom_nodes/ComfyUI-GGUF/ops.py", line 119, in get_weight
    weight = dequantize_tensor(tensor, dtype, self.dequant_dtype)
  File "/workspace/ComfyUI/custom_nodes/ComfyUI-GGUF/dequant.py", line 23, in dequantize_tensor
    return dequantize(tensor.data, qtype, oshape, dtype=dequant_dtype).to(dtype)
  File "/workspace/ComfyUI/custom_nodes/ComfyUI-GGUF/ops.py", line 23, in to
    new = super().to(*args, **kwargs)
  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/_tensor.py", line 1437, in __torch_function__
    ret = func(*args, **kwargs)
torch.OutOfMemoryError: Allocation on device 

2024-10-15 20:12:18,251 - root - ERROR - Got an OOM, unloading all loaded models.
2024-10-15 20:12:20,278 - root - INFO - Prompt executed in 2.51 seconds
2024-10-15 20:19:49,394 - root - INFO - got prompt
2024-10-15 20:19:49,410 - root - INFO - Requested to load FluxClipModel_
2024-10-15 20:19:49,411 - root - INFO - Loading 1 new model
2024-10-15 20:19:49,862 - root - INFO - loaded completely 0.0 5062.70263671875 True
2024-10-15 20:19:50,031 - root - INFO - Requested to load Flux
2024-10-15 20:19:50,031 - root - INFO - Loading 1 new model
2024-10-15 20:19:51,786 - root - INFO - loaded completely 0.0 12099.961059570312 True
2024-10-15 20:19:54,308 - root - INFO - Requested to load AutoencodingEngine
2024-10-15 20:19:54,308 - root - INFO - Loading 1 new model
2024-10-15 20:19:54,336 - root - INFO - loaded completely 0.0 159.87335777282715 True
2024-10-15 20:19:54,694 - root - INFO - Prompt executed in 5.29 seconds
2024-10-15 20:20:00,460 - root - INFO - got prompt
2024-10-15 20:20:03,370 - root - INFO - Prompt executed in 2.90 seconds
2024-10-15 20:20:17,158 - root - INFO - got prompt
2024-10-15 20:20:20,083 - root - INFO - Prompt executed in 2.91 seconds
2024-10-15 20:20:32,943 - root - INFO - got prompt
2024-10-15 20:20:39,714 - root - INFO - Prompt executed in 6.76 seconds
2024-10-15 20:20:49,453 - root - INFO - got prompt
2024-10-15 20:20:52,403 - root - INFO - Prompt executed in 2.94 seconds
2024-10-15 20:21:08,416 - root - INFO - got prompt
2024-10-15 20:21:11,347 - root - INFO - Prompt executed in 2.92 seconds
2024-10-15 20:21:31,028 - root - INFO - got prompt
2024-10-15 20:21:34,991 - root - INFO - Prompt executed in 3.95 seconds
2024-10-15 20:21:43,935 - root - INFO - got prompt
2024-10-15 20:21:46,853 - root - INFO - Prompt executed in 2.91 seconds
2024-10-15 20:21:54,293 - root - INFO - got prompt
2024-10-15 20:21:57,185 - root - INFO - Prompt executed in 2.88 seconds
2024-10-15 20:22:25,668 - root - INFO - got prompt
2024-10-15 20:22:25,715 - root - INFO - Requested to load FluxClipModel_
2024-10-15 20:22:25,715 - root - INFO - Loading 1 new model
2024-10-15 20:22:26,833 - root - INFO - loaded completely 0.0 5062.70263671875 True
2024-10-15 20:22:26,848 - root - ERROR - !!! Exception during processing !!! Allocation on device 
2024-10-15 20:22:26,852 - root - ERROR - Traceback (most recent call last):
  File "/workspace/ComfyUI/execution.py", line 323, in execute
    output_data, output_ui, has_subgraph = get_output_data(obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
  File "/workspace/ComfyUI/execution.py", line 198, in get_output_data
    return_values = _map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
  File "/workspace/ComfyUI/execution.py", line 169, in _map_node_over_list
    process_inputs(input_dict, i)
  File "/workspace/ComfyUI/execution.py", line 158, in process_inputs
    results.append(getattr(obj, func)(**inputs))
  File "/workspace/ComfyUI/nodes.py", line 65, in encode
    output = clip.encode_from_tokens(tokens, return_pooled=True, return_dict=True)
  File "/workspace/ComfyUI/comfy/sd.py", line 125, in encode_from_tokens
    o = self.cond_stage_model.encode_token_weights(tokens)
  File "/workspace/ComfyUI/comfy/text_encoders/flux.py", line 59, in encode_token_weights
    t5_out, t5_pooled = self.t5xxl.encode_token_weights(token_weight_pairs_t5)
  File "/workspace/ComfyUI/comfy/sd1_clip.py", line 41, in encode_token_weights
    o = self.encode(to_encode)
  File "/workspace/ComfyUI/comfy/sd1_clip.py", line 229, in encode
    return self(tokens)
  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/workspace/ComfyUI/comfy/sd1_clip.py", line 201, in forward
    outputs = self.transformer(tokens, attention_mask_model, intermediate_output=self.layer_idx, final_layer_norm_intermediate=self.layer_norm_hidden_state, dtype=torch.float32)
  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/workspace/ComfyUI/comfy/text_encoders/t5.py", line 238, in forward
    x = self.shared(input_ids, out_dtype=kwargs.get("dtype", torch.float32))
  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/workspace/ComfyUI/comfy/ops.py", line 211, in forward
    return self.forward_comfy_cast_weights(*args, **kwargs)
  File "/workspace/ComfyUI/custom_nodes/ComfyUI-GGUF/ops.py", line 152, in forward_comfy_cast_weights
    return self.forward_ggml_cast_weights(input, *args, **kwargs)
  File "/workspace/ComfyUI/custom_nodes/ComfyUI-GGUF/ops.py", line 187, in forward_ggml_cast_weights
    weight, _bias = self.cast_bias_weight(self, device=input.device, dtype=out_dtype)
  File "/workspace/ComfyUI/custom_nodes/ComfyUI-GGUF/ops.py", line 146, in cast_bias_weight
    weight = s.get_weight(s.weight.to(device), dtype)
  File "/workspace/ComfyUI/custom_nodes/ComfyUI-GGUF/ops.py", line 119, in get_weight
    weight = dequantize_tensor(tensor, dtype, self.dequant_dtype)
  File "/workspace/ComfyUI/custom_nodes/ComfyUI-GGUF/dequant.py", line 23, in dequantize_tensor
    return dequantize(tensor.data, qtype, oshape, dtype=dequant_dtype).to(dtype)
  File "/workspace/ComfyUI/custom_nodes/ComfyUI-GGUF/ops.py", line 23, in to
    new = super().to(*args, **kwargs)
  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/_tensor.py", line 1437, in __torch_function__
    ret = func(*args, **kwargs)
torch.OutOfMemoryError: Allocation on device 

2024-10-15 20:22:26,854 - root - ERROR - Got an OOM, unloading all loaded models.
2024-10-15 20:22:29,477 - root - INFO - Prompt executed in 3.80 seconds
2024-10-15 20:22:32,171 - root - INFO - got prompt
2024-10-15 20:22:32,198 - root - INFO - Requested to load FluxClipModel_
2024-10-15 20:22:32,199 - root - INFO - Loading 1 new model
2024-10-15 20:22:32,636 - root - INFO - loaded completely 0.0 5062.70263671875 True
2024-10-15 20:22:32,940 - root - INFO - Requested to load Flux
2024-10-15 20:22:32,940 - root - INFO - Loading 1 new model
2024-10-15 20:22:34,761 - root - INFO - loaded completely 0.0 12099.961059570312 True
2024-10-15 20:22:37,300 - root - INFO - Requested to load AutoencodingEngine
2024-10-15 20:22:37,300 - root - INFO - Loading 1 new model
2024-10-15 20:22:37,327 - root - INFO - loaded completely 0.0 159.87335777282715 True
2024-10-15 20:22:37,680 - root - INFO - Prompt executed in 5.50 seconds
2024-10-15 20:23:14,349 - root - INFO - got prompt
2024-10-15 20:23:17,251 - root - INFO - Prompt executed in 2.89 seconds
2024-10-15 20:23:22,653 - root - INFO - got prompt
2024-10-15 20:23:25,577 - root - INFO - Prompt executed in 2.91 seconds
2024-10-15 20:23:31,967 - root - INFO - got prompt
2024-10-15 20:23:31,992 - root - INFO - Requested to load FluxClipModel_
2024-10-15 20:23:31,993 - root - INFO - Loading 1 new model
2024-10-15 20:23:32,768 - root - INFO - loaded completely 0.0 5062.70263671875 True
2024-10-15 20:23:32,783 - root - ERROR - !!! Exception during processing !!! Allocation on device 
2024-10-15 20:23:32,786 - root - ERROR - Traceback (most recent call last):
  File "/workspace/ComfyUI/execution.py", line 323, in execute
    output_data, output_ui, has_subgraph = get_output_data(obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
  File "/workspace/ComfyUI/execution.py", line 198, in get_output_data
    return_values = _map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
  File "/workspace/ComfyUI/execution.py", line 169, in _map_node_over_list
    process_inputs(input_dict, i)
  File "/workspace/ComfyUI/execution.py", line 158, in process_inputs
    results.append(getattr(obj, func)(**inputs))
  File "/workspace/ComfyUI/nodes.py", line 65, in encode
    output = clip.encode_from_tokens(tokens, return_pooled=True, return_dict=True)
  File "/workspace/ComfyUI/comfy/sd.py", line 125, in encode_from_tokens
    o = self.cond_stage_model.encode_token_weights(tokens)
  File "/workspace/ComfyUI/comfy/text_encoders/flux.py", line 59, in encode_token_weights
    t5_out, t5_pooled = self.t5xxl.encode_token_weights(token_weight_pairs_t5)
  File "/workspace/ComfyUI/comfy/sd1_clip.py", line 41, in encode_token_weights
    o = self.encode(to_encode)
  File "/workspace/ComfyUI/comfy/sd1_clip.py", line 229, in encode
    return self(tokens)
  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/workspace/ComfyUI/comfy/sd1_clip.py", line 201, in forward
    outputs = self.transformer(tokens, attention_mask_model, intermediate_output=self.layer_idx, final_layer_norm_intermediate=self.layer_norm_hidden_state, dtype=torch.float32)
  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/workspace/ComfyUI/comfy/text_encoders/t5.py", line 238, in forward
    x = self.shared(input_ids, out_dtype=kwargs.get("dtype", torch.float32))
  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/workspace/ComfyUI/comfy/ops.py", line 211, in forward
    return self.forward_comfy_cast_weights(*args, **kwargs)
  File "/workspace/ComfyUI/custom_nodes/ComfyUI-GGUF/ops.py", line 152, in forward_comfy_cast_weights
    return self.forward_ggml_cast_weights(input, *args, **kwargs)
  File "/workspace/ComfyUI/custom_nodes/ComfyUI-GGUF/ops.py", line 187, in forward_ggml_cast_weights
    weight, _bias = self.cast_bias_weight(self, device=input.device, dtype=out_dtype)
  File "/workspace/ComfyUI/custom_nodes/ComfyUI-GGUF/ops.py", line 146, in cast_bias_weight
    weight = s.get_weight(s.weight.to(device), dtype)
  File "/workspace/ComfyUI/custom_nodes/ComfyUI-GGUF/ops.py", line 119, in get_weight
    weight = dequantize_tensor(tensor, dtype, self.dequant_dtype)
  File "/workspace/ComfyUI/custom_nodes/ComfyUI-GGUF/dequant.py", line 23, in dequantize_tensor
    return dequantize(tensor.data, qtype, oshape, dtype=dequant_dtype).to(dtype)
  File "/workspace/ComfyUI/custom_nodes/ComfyUI-GGUF/ops.py", line 23, in to
    new = super().to(*args, **kwargs)
  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/_tensor.py", line 1437, in __torch_function__
    ret = func(*args, **kwargs)
torch.OutOfMemoryError: Allocation on device 

2024-10-15 20:23:32,788 - root - ERROR - Got an OOM, unloading all loaded models.
2024-10-15 20:23:34,910 - root - INFO - Prompt executed in 2.93 seconds
2024-10-15 20:24:27,861 - root - INFO - got prompt
2024-10-15 20:24:27,887 - root - INFO - Requested to load FluxClipModel_
2024-10-15 20:24:27,887 - root - INFO - Loading 1 new model
2024-10-15 20:24:28,319 - root - INFO - loaded completely 0.0 5062.70263671875 True
2024-10-15 20:24:28,619 - root - INFO - Requested to load Flux
2024-10-15 20:24:28,619 - root - INFO - Loading 1 new model
2024-10-15 20:24:30,358 - root - INFO - loaded completely 0.0 12099.961059570312 True
2024-10-15 20:24:32,888 - root - INFO - Requested to load AutoencodingEngine
2024-10-15 20:24:32,888 - root - INFO - Loading 1 new model
2024-10-15 20:24:32,918 - root - INFO - loaded completely 0.0 159.87335777282715 True
2024-10-15 20:24:33,294 - root - INFO - Prompt executed in 5.42 seconds
2024-10-15 20:24:48,094 - root - INFO - got prompt
2024-10-15 20:24:51,032 - root - INFO - Prompt executed in 2.93 seconds
2024-10-15 20:25:24,535 - root - INFO - got prompt
2024-10-15 20:25:27,436 - root - INFO - Prompt executed in 2.89 seconds
2024-10-15 20:25:31,078 - root - INFO - got prompt
2024-10-15 20:25:33,995 - root - INFO - Prompt executed in 2.91 seconds
2024-10-15 20:25:54,309 - root - INFO - got prompt
2024-10-15 20:25:54,346 - root - INFO - Requested to load FluxClipModel_
2024-10-15 20:25:54,347 - root - INFO - Loading 1 new model
2024-10-15 20:25:54,791 - root - INFO - loaded completely 0.0 5062.70263671875 True
2024-10-15 20:25:54,807 - root - ERROR - !!! Exception during processing !!! Allocation on device 
2024-10-15 20:25:54,812 - root - ERROR - Traceback (most recent call last):
  File "/workspace/ComfyUI/execution.py", line 323, in execute
    output_data, output_ui, has_subgraph = get_output_data(obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
  File "/workspace/ComfyUI/execution.py", line 198, in get_output_data
    return_values = _map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
  File "/workspace/ComfyUI/execution.py", line 169, in _map_node_over_list
    process_inputs(input_dict, i)
  File "/workspace/ComfyUI/execution.py", line 158, in process_inputs
    results.append(getattr(obj, func)(**inputs))
  File "/workspace/ComfyUI/nodes.py", line 65, in encode
    output = clip.encode_from_tokens(tokens, return_pooled=True, return_dict=True)
  File "/workspace/ComfyUI/comfy/sd.py", line 125, in encode_from_tokens
    o = self.cond_stage_model.encode_token_weights(tokens)
  File "/workspace/ComfyUI/comfy/text_encoders/flux.py", line 59, in encode_token_weights
    t5_out, t5_pooled = self.t5xxl.encode_token_weights(token_weight_pairs_t5)
  File "/workspace/ComfyUI/comfy/sd1_clip.py", line 41, in encode_token_weights
    o = self.encode(to_encode)
  File "/workspace/ComfyUI/comfy/sd1_clip.py", line 229, in encode
    return self(tokens)
  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/workspace/ComfyUI/comfy/sd1_clip.py", line 201, in forward
    outputs = self.transformer(tokens, attention_mask_model, intermediate_output=self.layer_idx, final_layer_norm_intermediate=self.layer_norm_hidden_state, dtype=torch.float32)
  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/workspace/ComfyUI/comfy/text_encoders/t5.py", line 238, in forward
    x = self.shared(input_ids, out_dtype=kwargs.get("dtype", torch.float32))
  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/workspace/ComfyUI/comfy/ops.py", line 211, in forward
    return self.forward_comfy_cast_weights(*args, **kwargs)
  File "/workspace/ComfyUI/custom_nodes/ComfyUI-GGUF/ops.py", line 152, in forward_comfy_cast_weights
    return self.forward_ggml_cast_weights(input, *args, **kwargs)
  File "/workspace/ComfyUI/custom_nodes/ComfyUI-GGUF/ops.py", line 187, in forward_ggml_cast_weights
    weight, _bias = self.cast_bias_weight(self, device=input.device, dtype=out_dtype)
  File "/workspace/ComfyUI/custom_nodes/ComfyUI-GGUF/ops.py", line 146, in cast_bias_weight
    weight = s.get_weight(s.weight.to(device), dtype)
  File "/workspace/ComfyUI/custom_nodes/ComfyUI-GGUF/ops.py", line 119, in get_weight
    weight = dequantize_tensor(tensor, dtype, self.dequant_dtype)
  File "/workspace/ComfyUI/custom_nodes/ComfyUI-GGUF/dequant.py", line 23, in dequantize_tensor
    return dequantize(tensor.data, qtype, oshape, dtype=dequant_dtype).to(dtype)
  File "/workspace/ComfyUI/custom_nodes/ComfyUI-GGUF/ops.py", line 23, in to
    new = super().to(*args, **kwargs)
  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/_tensor.py", line 1437, in __torch_function__
    ret = func(*args, **kwargs)
torch.OutOfMemoryError: Allocation on device 

2024-10-15 20:25:54,815 - root - ERROR - Got an OOM, unloading all loaded models.
2024-10-15 20:25:57,231 - root - INFO - Prompt executed in 2.91 seconds
2024-10-15 20:26:00,630 - root - INFO - got prompt
2024-10-15 20:26:00,657 - root - INFO - Requested to load FluxClipModel_
2024-10-15 20:26:00,658 - root - INFO - Loading 1 new model
2024-10-15 20:26:01,061 - root - INFO - loaded completely 0.0 5062.70263671875 True
2024-10-15 20:26:01,365 - root - INFO - Requested to load Flux
2024-10-15 20:26:01,365 - root - INFO - Loading 1 new model
2024-10-15 20:26:03,047 - root - INFO - loaded completely 0.0 12099.961059570312 True
2024-10-15 20:26:05,592 - root - INFO - Requested to load AutoencodingEngine
2024-10-15 20:26:05,592 - root - INFO - Loading 1 new model
2024-10-15 20:26:05,623 - root - INFO - loaded completely 0.0 159.87335777282715 True
2024-10-15 20:26:05,989 - root - INFO - Prompt executed in 5.35 seconds
2024-10-15 20:26:35,732 - root - INFO - got prompt
2024-10-15 20:26:35,759 - root - INFO - Requested to load FluxClipModel_
2024-10-15 20:26:35,759 - root - INFO - Loading 1 new model
2024-10-15 20:26:36,199 - root - INFO - loaded completely 0.0 5062.70263671875 True
2024-10-15 20:26:36,214 - root - ERROR - !!! Exception during processing !!! Allocation on device 
2024-10-15 20:26:36,218 - root - ERROR - Traceback (most recent call last):
  File "/workspace/ComfyUI/execution.py", line 323, in execute
    output_data, output_ui, has_subgraph = get_output_data(obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
  File "/workspace/ComfyUI/execution.py", line 198, in get_output_data
    return_values = _map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
  File "/workspace/ComfyUI/execution.py", line 169, in _map_node_over_list
    process_inputs(input_dict, i)
  File "/workspace/ComfyUI/execution.py", line 158, in process_inputs
    results.append(getattr(obj, func)(**inputs))
  File "/workspace/ComfyUI/nodes.py", line 65, in encode
    output = clip.encode_from_tokens(tokens, return_pooled=True, return_dict=True)
  File "/workspace/ComfyUI/comfy/sd.py", line 125, in encode_from_tokens
    o = self.cond_stage_model.encode_token_weights(tokens)
  File "/workspace/ComfyUI/comfy/text_encoders/flux.py", line 59, in encode_token_weights
    t5_out, t5_pooled = self.t5xxl.encode_token_weights(token_weight_pairs_t5)
  File "/workspace/ComfyUI/comfy/sd1_clip.py", line 41, in encode_token_weights
    o = self.encode(to_encode)
  File "/workspace/ComfyUI/comfy/sd1_clip.py", line 229, in encode
    return self(tokens)
  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/workspace/ComfyUI/comfy/sd1_clip.py", line 201, in forward
    outputs = self.transformer(tokens, attention_mask_model, intermediate_output=self.layer_idx, final_layer_norm_intermediate=self.layer_norm_hidden_state, dtype=torch.float32)
  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/workspace/ComfyUI/comfy/text_encoders/t5.py", line 238, in forward
    x = self.shared(input_ids, out_dtype=kwargs.get("dtype", torch.float32))
  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1553, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1562, in _call_impl
    return forward_call(*args, **kwargs)
  File "/workspace/ComfyUI/comfy/ops.py", line 211, in forward
    return self.forward_comfy_cast_weights(*args, **kwargs)
  File "/workspace/ComfyUI/custom_nodes/ComfyUI-GGUF/ops.py", line 152, in forward_comfy_cast_weights
    return self.forward_ggml_cast_weights(input, *args, **kwargs)
  File "/workspace/ComfyUI/custom_nodes/ComfyUI-GGUF/ops.py", line 187, in forward_ggml_cast_weights
    weight, _bias = self.cast_bias_weight(self, device=input.device, dtype=out_dtype)
  File "/workspace/ComfyUI/custom_nodes/ComfyUI-GGUF/ops.py", line 146, in cast_bias_weight
    weight = s.get_weight(s.weight.to(device), dtype)
  File "/workspace/ComfyUI/custom_nodes/ComfyUI-GGUF/ops.py", line 119, in get_weight
    weight = dequantize_tensor(tensor, dtype, self.dequant_dtype)
  File "/workspace/ComfyUI/custom_nodes/ComfyUI-GGUF/dequant.py", line 23, in dequantize_tensor
    return dequantize(tensor.data, qtype, oshape, dtype=dequant_dtype).to(dtype)
  File "/workspace/ComfyUI/custom_nodes/ComfyUI-GGUF/ops.py", line 23, in to
    new = super().to(*args, **kwargs)
  File "/opt/environments/python/comfyui/lib/python3.10/site-packages/torch/_tensor.py", line 1437, in __torch_function__
    ret = func(*args, **kwargs)
torch.OutOfMemoryError: Allocation on device 

2024-10-15 20:26:36,220 - root - ERROR - Got an OOM, unloading all loaded models.
2024-10-15 20:26:38,249 - root - INFO - Prompt executed in 2.51 seconds
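
Every one of the repeated tracebacks fails at the same spot: ComfyUI-GGUF dequantizing the quantized T5 encoder weights (dequantize_tensor -> super().to(...)) to float32 on the GPU, right after a prompt-text change forces CLIPTextEncode to rerun, which suggests the ~12 GB Flux UNet from the previous run is still resident when the extra allocation is attempted. Below is a minimal sketch (plain PyTorch, not part of the workflow; the helper name is mine) for checking how much headroom is left right before queueing a prompt with changed text:

# Minimal VRAM check, assuming PyTorch with CUDA is available in the same
# environment that runs ComfyUI. Run it while the server is idle.
import torch

def report_vram(device_index: int = 0) -> None:
    free_bytes, total_bytes = torch.cuda.mem_get_info(device_index)
    gib = 1024 ** 3
    print(f"GPU {device_index}: {free_bytes / gib:.2f} GiB free "
          f"of {total_bytes / gib:.2f} GiB total")
    # The log above reports ~12.1 GB requested for Flux plus ~5.1 GB for the
    # text encoder; if free memory is below what the encoder needs, the first
    # CLIPTextEncode after a prompt change is likely to OOM exactly as shown.

if __name__ == "__main__":
    report_vram()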

Attached Workflow

Please make sure that workflow does not contain any sensitive information such as API keys or passwords.

{"last_node_id":12,"last_link_id":13,"nodes":[{"id":12,"type":"VAELoader","pos":{"0":862,"1":504},"size":{"0":315,"1":58},"flags":{},"order":0,"mode":0,"inputs":[],"outputs":[{"name":"VAE","type":"VAE","links":[13],"shape":3}],"properties":{"Node name for S&R":"VAELoader"},"widgets_values":["ae.safetensors"]},{"id":8,"type":"VAEDecode","pos":{"0":904,"1":55},"size":{"0":210,"1":46},"flags":{},"order":7,"mode":0,"inputs":[{"name":"samples","type":"LATENT","link":7},{"name":"vae","type":"VAE","link":13}],"outputs":[{"name":"IMAGE","type":"IMAGE","links":[9],"slot_index":0}],"properties":{"Node name for S&R":"VAEDecode"}},{"id":7,"type":"CLIPTextEncode","pos":{"0":380,"1":514},"size":{"0":427.3594970703125,"1":76},"flags":{},"order":5,"mode":0,"inputs":[{"name":"clip","type":"CLIP","link":12}],"outputs":[{"name":"CONDITIONING","type":"CONDITIONING","links":[6],"slot_index":0}],"properties":{"Node name for S&R":"CLIPTextEncode"},"widgets_values":[""]},{"id":10,"type":"UnetLoaderGGUF","pos":{"0":480,"1":193},"size":{"0":315,"1":58},"flags":{},"order":1,"mode":0,"inputs":[],"outputs":[{"name":"MODEL","type":"MODEL","links":[10],"shape":3}],"properties":{"Node name for S&R":"UnetLoaderGGUF"},"widgets_values":["flux1-schnell-Q8_0.gguf"]},{"id":11,"type":"DualCLIPLoaderGGUF","pos":{"0":485,"1":38},"size":{"0":315,"1":106},"flags":{},"order":2,"mode":0,"inputs":[],"outputs":[{"name":"CLIP","type":"CLIP","links":[11,12],"slot_index":0,"shape":3}],"properties":{"Node name for S&R":"DualCLIPLoaderGGUF"},"widgets_values":["clip_l.safetensors","t5-v1_1-xxl-encoder-Q8_0.gguf","flux"]},{"id":9,"type":"SaveImage","pos":{"0":1202,"1":56},"size":[767.8958005578725,813.7210937499999],"flags":{},"order":8,"mode":0,"inputs":[{"name":"images","type":"IMAGE","link":9}],"outputs":[],"properties":{},"widgets_values":["ComfyUI"]},{"id":5,"type":"EmptyLatentImage","pos":{"0":494,"1":646},"size":{"0":315,"1":106},"flags":{},"order":3,"mode":0,"inputs":[],"outputs":[{"name":"LATENT","type":"LATENT","links":[2],"slot_index":0}],"properties":{"Node name for S&R":"EmptyLatentImage"},"widgets_values":[1024,1024,1]},{"id":3,"type":"KSampler","pos":{"0":863,"1":186},"size":{"0":315,"1":262},"flags":{},"order":6,"mode":0,"inputs":[{"name":"model","type":"MODEL","link":10},{"name":"positive","type":"CONDITIONING","link":4},{"name":"negative","type":"CONDITIONING","link":6},{"name":"latent_image","type":"LATENT","link":2}],"outputs":[{"name":"LATENT","type":"LATENT","links":[7],"slot_index":0}],"properties":{"Node name for S&R":"KSampler"},"widgets_values":[224070116344688,"randomize",4,1,"dpmpp_2m","simple",1]},{"id":6,"type":"CLIPTextEncode","pos":{"0":400,"1":308},"size":{"0":421.8337707519531,"1":133.60507202148438},"flags":{},"order":4,"mode":0,"inputs":[{"name":"clip","type":"CLIP","link":11}],"outputs":[{"name":"CONDITIONING","type":"CONDITIONING","links":[4],"slot_index":0}],"properties":{"Node name for S&R":"CLIPTextEncode"},"widgets_values":["(masterpiece), best quality, expressive eyes, perfect face, , sam yang, (Joe Madureira:0.8), (Extremely detailed), (Intricate details), Beautiful girl, tall and slender fit body, large breasts, shiny pale oiled skin, (Very long lush shiny straight (((iridescent white))) hair), (Extremely detailed beautiful vivid pale grey eyes), (Perfect make-up, mascara, eyeliner, blush), (Very long very fluffy fake lashes), (Delicate face features), ((Very long sharp shiny black nails)), (Intricate massive jewelry), Very large earloops earrings, black brows, Shiny glossy lips, (Cowboy shot), 
(Shiny navel piercing), real, hyperrealisim, super realistic skin texture,"]}],"links":[[2,5,0,3,3,"LATENT"],[4,6,0,3,1,"CONDITIONING"],[6,7,0,3,2,"CONDITIONING"],[7,3,0,8,0,"LATENT"],[9,8,0,9,0,"IMAGE"],[10,10,0,3,0,"MODEL"],[11,11,0,6,0,"CLIP"],[12,11,0,7,0,"CLIP"],[13,12,0,8,1,"VAE"]],"groups":[],"config":{},"extra":{"ds":{"scale":1.01,"offset":[-346.0182823003465,4.376936704119686]}},"version":0.4}

pribeh avatar Oct 15 '24 20:10 pribeh

Same problem with the first run of every Flux workflow. The second and following runs are successful.
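
That matches the log pattern above: the failed first run ends with "Got an OOM, unloading all loaded models", so the retry starts from a freshly cleared card and succeeds. One possible workaround (a sketch only, assuming a recent ComfyUI build that exposes the POST /free route and a server on the default 127.0.0.1:8188 address; the function name is just for illustration) is to ask the server to unload models and free cached VRAM before queueing the first prompt:

# Hypothetical workaround sketch: request model unloading before the first run,
# mimicking what ComfyUI's automatic OOM recovery does after a failure.
import json
import urllib.request

def free_comfy_memory(host: str = "http://127.0.0.1:8188") -> None:
    payload = json.dumps({"unload_models": True, "free_memory": True}).encode()
    req = urllib.request.Request(
        f"{host}/free", data=payload,
        headers={"Content-Type": "application/json"}, method="POST")
    urllib.request.urlopen(req).read()

if __name__ == "__main__":
    free_comfy_memory()

The same request can also be sent with curl; the point is only to clear VRAM before the first CLIPTextEncode runs.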

johnnykorm avatar Nov 23 '24 09:11 johnnykorm

Same here, although it happens when I'm using the "Ollama Generate" node: VRAM goes to 99%/100% while processing the "CLIP Text Encode" node.
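
If Ollama keeps its model loaded while the workflow runs, the text encoder has that much less VRAM to work with. A quick way to confirm (assuming an NVIDIA GPU with nvidia-smi on PATH; the helper below is only an illustration) is to list the processes holding GPU memory while the "CLIP Text Encode" node is active:

# List processes holding GPU memory; if an Ollama process appears alongside
# ComfyUI, its share of VRAM explains the "Allocation on device" failures.
import subprocess

def list_gpu_processes() -> str:
    return subprocess.check_output(
        ["nvidia-smi",
         "--query-compute-apps=pid,process_name,used_memory",
         "--format=csv"],
        text=True)

if __name__ == "__main__":
    print(list_gpu_processes())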

MaraScott avatar Nov 26 '24 09:11 MaraScott