ComfyUI
HIP error: invalid device function when running ComfyUI
I'm on Arch Linux (kernel 6.7.4-arch1-1) and run ComfyUI inside a Python virtual environment. My GPU is a Radeon RX 5700 XT and my CPU is a Ryzen 5 3600.
HIP error: invalid device function
Compile with `TORCH_USE_HIP_DSA` to enable device-side assertions.
File "/opt/ComfyUI/execution.py", line 152, in recursive_execute
output_data, output_ui = get_output_data(obj, input_data_all)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/ComfyUI/execution.py", line 82, in get_output_data
return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/ComfyUI/execution.py", line 75, in map_node_over_list
results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/ComfyUI/nodes.py", line 56, in encode
cond, pooled = clip.encode_from_tokens(tokens, return_pooled=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/ComfyUI/comfy/sd.py", line 128, in encode_from_tokens
cond, pooled = self.cond_stage_model.encode_token_weights(tokens)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/ComfyUI/comfy/sd1_clip.py", line 514, in encode_token_weights
out, pooled = getattr(self, self.clip).encode_token_weights(token_weight_pairs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/ComfyUI/comfy/sd1_clip.py", line 39, in encode_token_weights
out, pooled = self.encode(to_encode)
^^^^^^^^^^^^^^^^^^^^^^
File "/opt/ComfyUI/comfy/sd1_clip.py", line 190, in encode
return self(tokens)
^^^^^^^^^^^^
File "/opt/ComfyUI/venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1529, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/ComfyUI/venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1538, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/ComfyUI/comfy/sd1_clip.py", line 172, in forward
outputs = self.transformer(tokens, attention_mask, intermediate_output=self.layer_idx, final_layer_norm_intermediate=self.layer_norm_hidden_state)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/ComfyUI/venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1529, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/ComfyUI/venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1538, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/ComfyUI/comfy/clip_model.py", line 131, in forward
return self.text_model(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/ComfyUI/venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1529, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/ComfyUI/venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1538, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/ComfyUI/comfy/clip_model.py", line 97, in forward
x = self.embeddings(input_tokens)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/ComfyUI/venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1529, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/ComfyUI/venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1538, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/ComfyUI/comfy/clip_model.py", line 80, in forward
return self.token_embedding(input_tokens) + self.position_embedding.weight
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/ComfyUI/venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1529, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/ComfyUI/venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1538, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/ComfyUI/venv/lib/python3.11/site-packages/torch/nn/modules/sparse.py", line 163, in forward
return F.embedding(
^^^^^^^^^^^^
File "/opt/ComfyUI/venv/lib/python3.11/site-packages/torch/nn/functional.py", line 2264, in embedding
return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Hi, did you solve this?
Sadly no.
I managed to install it on my machine using some tutorials on the internet. Would you like to see them? Maybe this will help you.
Why not! :)
Did you set the environment variable HSA_OVERRIDE_GFX_VERSION=10.3.0 in order to override the GPU target for ROCm? The 5700 XT is not officially supported, but you can try to make it work that way.
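A minimal sketch of that suggestion, assuming ComfyUI is started with `python main.py` from the `/opt/ComfyUI` directory seen in the tracebacks. The helper names `rocm_env` and `launch_comfyui` are made up for illustration; only the variable name and its value come from this thread.

```python
import os
import subprocess
import sys


def rocm_env(gfx_version: str = "10.3.0") -> dict:
    """Return a copy of the current environment with the ROCm override set."""
    env = dict(os.environ)
    # The override is read when the ROCm runtime starts, so it is safest to
    # have it in the environment before torch is imported at all.
    env["HSA_OVERRIDE_GFX_VERSION"] = gfx_version
    return env


def launch_comfyui(comfy_dir: str = "/opt/ComfyUI") -> int:
    """Start ComfyUI in a child process that inherits the override.

    The directory and the main.py entry point are assumptions.
    """
    return subprocess.call([sys.executable, "main.py"], cwd=comfy_dir, env=rocm_env())
```

Calling `launch_comfyui()` from the virtual environment's interpreter should be equivalent to running `HSA_OVERRIDE_GFX_VERSION=10.3.0 python main.py` in a shell.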
See also ComfyUI launch instructions:
Here's what I got after setting this environment variable up.
Error occurred when executing CheckpointLoaderSimple:
HIP error: invalid device function
HIP kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing AMD_SERIALIZE_KERNEL=3.
Compile with `TORCH_USE_HIP_DSA` to enable device-side assertions.
File "/opt/ComfyUI/execution.py", line 152, in recursive_execute
output_data, output_ui = get_output_data(obj, input_data_all)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/ComfyUI/execution.py", line 82, in get_output_data
return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/ComfyUI/execution.py", line 75, in map_node_over_list
results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/ComfyUI/nodes.py", line 552, in load_checkpoint
out = comfy.sd.load_checkpoint_guess_config(ckpt_path, output_vae=True, output_clip=True, embedding_directory=folder_paths.get_folder_paths("embeddings"))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/ComfyUI/comfy/sd.py", line 461, in load_checkpoint_guess_config
model = model_config.get_model(sd, "model.diffusion_model.", device=inital_load_device)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/ComfyUI/comfy/supported_models_base.py", line 51, in get_model
out = model_base.BaseModel(self, model_type=self.model_type(state_dict, prefix), device=device)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/ComfyUI/comfy/model_base.py", line 51, in __init__
self.diffusion_model = UNetModel(**unet_config, device=device, operations=operations)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/ComfyUI/comfy/ldm/modules/diffusionmodules/openaimodel.py", line 807, in __init__
zero_module(operations.conv_nd(dims, model_channels, out_channels, 3, padding=1, dtype=self.dtype, device=device)),
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/ComfyUI/comfy/ldm/modules/diffusionmodules/util.py", line 255, in zero_module
p.detach().zero_()
Just to check, which version of pytorch are you running?
pip show torch
Searching the ROCm issues also turns up some other environment variables that might work in conjunction with it, but no guarantees:
PYTORCH_ROCM_ARCH="gfx1031"
HSA_OVERRIDE_GFX_VERSION=10.3.1
HIP_VISIBLE_DEVICES=0
ROCM_PATH=/opt/rocm
pip show torch
=>
Name: torch
Version: 2.3.0.dev20240219+rocm6.0
Summary: Tensors and Dynamic neural networks in Python with strong GPU acceleration
Home-page: https://pytorch.org/
Author: PyTorch Team
Author-email: [email protected]
License: BSD-3
Location: /opt/ComfyUI/venv/lib/python3.11/site-packages
Requires: filelock, fsspec, jinja2, networkx, pytorch-triton-rocm, sympy, typing-extensions
Required-by: torchaudio, torchsde, torchvision
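For comparing wheels quickly, here is a small sketch that pulls the ROCm version out of a PyTorch version string like the one above. The function name `rocm_build_tag` is hypothetical, and the pattern assumes the `+rocmX.Y` suffix convention visible in this output.

```python
import re
from typing import Optional


def rocm_build_tag(torch_version: str) -> Optional[str]:
    """Extract the ROCm version from a PyTorch wheel version string, if present."""
    match = re.search(r"\+rocm(\d+(?:\.\d+)*)", torch_version)
    return match.group(1) if match else None


# Version string copied from the pip output above.
print(rocm_build_tag("2.3.0.dev20240219+rocm6.0"))  # 6.0
```

A result of `None` would mean the installed wheel is not a ROCm build at all (for example a CPU-only or CUDA wheel).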
If you also need my rocminfo:
=====================
HSA System Attributes
=====================
Runtime Version: 1.1
System Timestamp Freq.: 1000.000000MHz
Sig. Max Wait Duration: 18446744073709551615 (0xFFFFFFFFFFFFFFFF) (timestamp count)
Machine Model: LARGE
System Endianness: LITTLE
==========
HSA Agents
==========
*******
Agent 1
*******
Name: AMD Ryzen 5 3600 6-Core Processor
Uuid: CPU-XX
Marketing Name: AMD Ryzen 5 3600 6-Core Processor
Vendor Name: CPU
Feature: None specified
Profile: FULL_PROFILE
Float Round Mode: NEAR
Max Queue Number: 0(0x0)
Queue Min Size: 0(0x0)
Queue Max Size: 0(0x0)
Queue Type: MULTI
Node: 0
Device Type: CPU
Cache Info:
L1: 32768(0x8000) KB
Chip ID: 0(0x0)
ASIC Revision: 0(0x0)
Cacheline Size: 64(0x40)
Max Clock Freq. (MHz): 3600
BDFID: 0
Internal Node ID: 0
Compute Unit: 12
SIMDs per CU: 0
Shader Engines: 0
Shader Arrs. per Eng.: 0
WatchPts on Addr. Ranges:1
Features: None
Pool Info:
Pool 1
Segment: GLOBAL; FLAGS: FINE GRAINED
Size: 16320652(0xf9088c) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Alignment: 4KB
Accessible by all: TRUE
Pool 2
Segment: GLOBAL; FLAGS: KERNARG, FINE GRAINED
Size: 16320652(0xf9088c) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Alignment: 4KB
Accessible by all: TRUE
Pool 3
Segment: GLOBAL; FLAGS: COARSE GRAINED
Size: 16320652(0xf9088c) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Alignment: 4KB
Accessible by all: TRUE
ISA Info:
*******
Agent 2
*******
Name: gfx1010
Uuid: GPU-XX
Marketing Name: AMD Radeon RX 5700 XT
Vendor Name: AMD
Feature: KERNEL_DISPATCH
Profile: BASE_PROFILE
Float Round Mode: NEAR
Max Queue Number: 128(0x80)
Queue Min Size: 64(0x40)
Queue Max Size: 131072(0x20000)
Queue Type: MULTI
Node: 1
Device Type: GPU
Cache Info:
L1: 16(0x10) KB
L2: 4096(0x1000) KB
Chip ID: 29471(0x731f)
ASIC Revision: 2(0x2)
Cacheline Size: 64(0x40)
Max Clock Freq. (MHz): 2100
BDFID: 10240
Internal Node ID: 1
Compute Unit: 40
SIMDs per CU: 2
Shader Engines: 2
Shader Arrs. per Eng.: 2
WatchPts on Addr. Ranges:4
Features: KERNEL_DISPATCH
Fast F16 Operation: TRUE
Wavefront Size: 32(0x20)
Workgroup Max Size: 1024(0x400)
Workgroup Max Size per Dimension:
x 1024(0x400)
y 1024(0x400)
z 1024(0x400)
Max Waves Per CU: 40(0x28)
Max Work-item Per CU: 1280(0x500)
Grid Max Size: 4294967295(0xffffffff)
Grid Max Size per Dimension:
x 4294967295(0xffffffff)
y 4294967295(0xffffffff)
z 4294967295(0xffffffff)
Max fbarriers/Workgrp: 32
Pool Info:
Pool 1
Segment: GLOBAL; FLAGS: COARSE GRAINED
Size: 8372224(0x7fc000) KB
Allocatable: TRUE
Alloc Granule: 4KB
Alloc Alignment: 4KB
Accessible by all: FALSE
Pool 2
Segment: GROUP
Size: 64(0x40) KB
Allocatable: FALSE
Alloc Granule: 0KB
Alloc Alignment: 0KB
Accessible by all: FALSE
ISA Info:
ISA 1
Name: amdgcn-amd-amdhsa--gfx1010:xnack-
Machine Models: HSA_MACHINE_MODEL_LARGE
Profiles: HSA_PROFILE_BASE
Default Rounding Mode: NEAR
Default Rounding Mode: NEAR
Fast f16: TRUE
Workgroup Max Size: 1024(0x400)
Workgroup Max Size per Dimension:
x 1024(0x400)
y 1024(0x400)
z 1024(0x400)
Grid Max Size: 4294967295(0xffffffff)
Grid Max Size per Dimension:
x 4294967295(0xffffffff)
y 4294967295(0xffffffff)
z 4294967295(0xffffffff)
FBarrier Max Size: 32
*** Done ***
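The line that matters in this dump is the GPU's hardware target (`gfx1010`). A small sketch for extracting it from saved `rocminfo` text follows; the function name `gfx_targets` is made up, and the sample only reuses lines from the output above.

```python
import re


def gfx_targets(rocminfo_text: str) -> list:
    """Return the distinct gfx targets named in rocminfo output, in order of appearance."""
    found = re.findall(r"\bgfx[0-9a-f]+\b", rocminfo_text)
    # dict.fromkeys removes duplicates while keeping the original order.
    return list(dict.fromkeys(found))


sample = """
Name: gfx1010
Marketing Name: AMD Radeon RX 5700 XT
Name: amdgcn-amd-amdhsa--gfx1010:xnack-
"""
print(gfx_targets(sample))  # ['gfx1010']
```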
Have you tried downgrading to pytorch 2.2 with rocm5.7? You can run this even with ROCm 6.0 binaries installed on the host.
pip uninstall torch torchvision torchaudio
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm5.7
It's now even worse. Clicking on the Queue button doesn't show anything so I have to watch my command console to get the error.
ERROR:root:!!! Exception during processing !!!
ERROR:root:Traceback (most recent call last):
File "/opt/ComfyUI/execution.py", line 152, in recursive_execute
output_data, output_ui = get_output_data(obj, input_data_all)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/ComfyUI/execution.py", line 82, in get_output_data
return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/ComfyUI/execution.py", line 75, in map_node_over_list
results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/ComfyUI/nodes.py", line 552, in load_checkpoint
out = comfy.sd.load_checkpoint_guess_config(ckpt_path, output_vae=True, output_clip=True, embedding_directory=folder_paths.get_folder_paths("embeddings"))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/ComfyUI/comfy/sd.py", line 461, in load_checkpoint_guess_config
model = model_config.get_model(sd, "model.diffusion_model.", device=inital_load_device)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/ComfyUI/comfy/supported_models_base.py", line 51, in get_model
out = model_base.BaseModel(self, model_type=self.model_type(state_dict, prefix), device=device)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/ComfyUI/comfy/model_base.py", line 51, in __init__
self.diffusion_model = UNetModel(**unet_config, device=device, operations=operations)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/ComfyUI/comfy/ldm/modules/diffusionmodules/openaimodel.py", line 807, in __init__
zero_module(operations.conv_nd(dims, model_channels, out_channels, 3, padding=1, dtype=self.dtype, device=device)),
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/ComfyUI/comfy/ldm/modules/diffusionmodules/util.py", line 255, in zero_module
p.detach().zero_()
RuntimeError: HIP error: invalid device function
HIP kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing HIP_LAUNCH_BLOCKING=1.
Compile with `TORCH_USE_HIP_DSA` to enable device-side assertions.
Prompt executed in 0.06 seconds
No idea?
Did you find a solution? I'm running into the same issue.
Sadly not...
To my knowledge, the problem is that current torch releases are compiled with hardware features that RDNA1 does not support. You might have to step back through older releases until you find a PyTorch version that was compiled with support for the gfx1010 hardware target, or compile your own. If I remember correctly, it was some release of torch 2.0, which was compiled against ROCm 5.3 or 5.2.
Of course this might break support on other ends. But it might be worth a try.
The override variable basically tells the backend that you have a different GPU than you actually do, so that it is allowed to run. But if the PyTorch package then tries to use functions the hardware does not support, it will error out. To the detriment of ML usability, AMD has made many changes to its hardware since Polaris, whereas Nvidia's CUDA stack is a lot more mature because it has been around much longer.
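To make the relationship between target names and override values concrete, here is a sketch that converts a name such as `gfx1030` into the dotted form. The `major.minor.stepping` mapping is inferred from the values quoted in this thread (10.3.0, 10.3.1 alongside gfx1031, and 11.0.0), so treat it as an assumption; the function name `override_for` is hypothetical, and targets with a letter suffix are deliberately rejected.

```python
def override_for(gfx_name: str) -> str:
    """Convert a target such as 'gfx1030' into the 'major.minor.stepping' form."""
    digits = gfx_name[3:] if gfx_name.startswith("gfx") else ""
    if len(digits) < 3 or not digits.isdigit():
        raise ValueError(f"unsupported target name: {gfx_name!r}")
    # Assumed layout: last digit is the stepping, the one before it the minor
    # version, and everything in front of those is the major version.
    major, minor, stepping = digits[:-2], digits[-2], digits[-1]
    return f"{int(major)}.{int(minor)}.{int(stepping)}"


print(override_for("gfx1030"))  # 10.3.0
print(override_for("gfx1010"))  # 10.1.0
```

Under this reading, setting the override to 10.3.0 on a 5700 XT asks the runtime to treat a gfx1010 card as gfx1030, which only works as long as the kernels in the wheel stick to instructions both generations share.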
I definitely recall running an early version of Stable Diffusion on my old 5700XT a long time ago.
I will try to build PyTorch myself to avoid any other compatibility issues. Just out of curiosity, why is the 5700 XT (gfx1010) no longer supported anyway? It's definitely not that old of a graphics card.
EDIT: I tried building it myself after installing the dependencies, creating a new Python virtual environment, and adding a 20 GB swapfile so the huge amount of linking wouldn't freeze my computer. But I got this error output: log.txt
Made it work using this
Okay, because building PyTorch was tedious and RAM-greedy, I installed the binaries from the ROCm 5.3 PyTorch repositories and it apparently works... I mean, the interface is rendered and the error window no longer appears. But I noticed it couldn't initialize NVML, and it weirdly considers my Radeon to be a CUDA device, I guess? I still cannot generate anything because the queue seems to get stuck on the CLIP Text Encode node (it's highlighted in green).
Total VRAM 8176 MB, total RAM 15938 MB
Set vram state to: NORMAL_VRAM
/opt/pytorch-gfx1010-venv/lib/python3.11/site-packages/torch/cuda/__init__.py:546: UserWarning: Can't initialize NVML
warnings.warn("Can't initialize NVML")
Device: cuda:0 AMD Radeon RX 5700 XT : native
VAE dtype: torch.float32
Using sub quadratic optimization for cross attention, if you have memory or speed issues try using: --use-split-cross-attention
Starting server
To see the GUI go to: http://127.0.0.1:8188
got prompt
model_type EPS
adm 0
Using split attention in VAE
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
Using split attention in VAE
missing {'cond_stage_model.clip_l.text_projection', 'cond_stage_model.clip_l.logit_scale'}
left over keys: dict_keys(['cond_stage_model.clip_l.transformer.text_model.embeddings.position_ids'])
Requested to load SD1ClipModel
Loading 1 new model
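On the `cuda:0` label: ROCm builds of PyTorch expose AMD GPUs through the same `torch.cuda` namespace, so that part of the log is expected, and the NVML warning is likely harmless on a machine without an Nvidia driver. The sketch below reports which backend the installed build targets; the function name `describe_backend` is made up, and it falls back gracefully when torch is not installed.

```python
def describe_backend() -> dict:
    """Report which GPU backend the installed PyTorch build targets."""
    try:
        import torch
    except ImportError:
        return {"torch": None}
    info = {
        "torch": torch.__version__,
        # torch.version.hip is a version string on ROCm builds and None otherwise.
        "hip": getattr(torch.version, "hip", None),
        "gpu_available": torch.cuda.is_available(),
    }
    if info["gpu_available"]:
        # On a ROCm build this returns the AMD card's name despite the "cuda" API.
        info["device_name"] = torch.cuda.get_device_name(0)
    return info


print(describe_backend())
```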
@ddeityy It didn't work for me. My Python env says it's "not compatible with this platform".
No more ideas? :(
same, following
RDNA1's ROCm support is a tragedy. The best solution is to buy a newer card or compile the ROCm libraries yourself.
For Debian users, support for gfx1010 is enabled in Trixie's libraries (by the Debian community), but Trixie is an unstable distribution.
https://salsa.debian.org/rocm-team/community/team-project/-/wikis/Supported-GPU-list#gfx1010
To use custom AMD ROCm libraries, it's better to compile PyTorch yourself. The current PyTorch builds for ROCm already bundle the official ROCm libraries, which might interfere if you are trying to use your own.
I don't have an RX 5700 card, so I can't help you directly, but these tips might help. Cheer up! :mechanical_arm:
You could also try some Docker images; that might help! :smiley:
Thanks for the tips, I'll try my best.
Thank you all. This is a very helpful thread. I managed to run my script on my AMD Radeon RX 7700S on EndeavourOS (Arch) by running:
HSA_OVERRIDE_GFX_VERSION=11.0.0 python script.py
My rocminfo and script are attached. Feel free to use my script to test performance difference between CPU and GPU with torch.
I can't run ComfyUI with HSA_OVERRIDE_GFX_VERSION, although a1111 works fine with it.
comfyui-rocm | [Crystools INFO] CPU: AMD Ryzen 7 7700 8-Core Processor - Arch: x86_64 - OS: Linux 6.1.0-18-amd64
comfyui-rocm | torch.cuda.OutOfMemoryError: HIP out of memory. Tried to allocate 20.00 MiB. GPU 0 has a total capacty of 512.00 MiB of which 17179869183.98 GiB is free. Of the allocated memory 414.15 MiB is allocated by PyTorch, and 1.85 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_HIP_ALLOC_CONF
It's saying torch.cuda.OutOfMemoryError, so the only reason might be that your model is too big to fit in your video card's memory.
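One detail in that error message is worth a quick calculation: the "free" figure of 17179869183.98 GiB is just under 2^64 bytes. The sketch below reinterprets it as a signed value, assuming the number comes from an unsigned 64-bit byte counter (that is an assumption, not something the log states); the helper name `as_signed_gib` is made up.

```python
GIB = 2 ** 30
reported_free_gib = 17179869183.98  # figure copied from the error message

# 2**64 bytes expressed in GiB; an unsigned 64-bit byte counter wraps at this value.
wrap_gib = 2 ** 64 / GIB


def as_signed_gib(value_gib: float) -> float:
    """Reinterpret a reading close to 2**64 bytes as a small negative number of GiB."""
    return value_gib - wrap_gib if value_gib > wrap_gib / 2 else value_gib


# Convert to MiB for comparison with the 20.00 MiB allocation in the message.
print(round(as_signed_gib(reported_free_gib) * 1024, 1))
```

Under that assumption the free memory works out to roughly -20 MiB, which suggests the figure is a wrapped negative number rather than a real measurement, so the memory totals in this message may not be reliable.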
OK, I'll try to investigate more. I just tried the default workflow; with a1111 I can run SDXL fine.
@supersonictw Thanks, I tried SDXL with --lowvram. It passes KSampler but then fails at VAE Decode, like this: https://github.com/comfyanonymous/ComfyUI/issues/2431