Screen freezes when running KSampler
Expected Behavior
ComfyUI runs the workflow and generates the image without the system freezing.
Actual Behavior
The screen freezes while the mouse is still responsive.
The GPU fans didn't spin at first, but the card is now hot and blowing air, so I think it is still running something; however, the entire GUI is unresponsive.
This is the second time this has happened, on a second install, on a machine that fails at every level.
Steps to Reproduce
/
Debug Logs
/
Other
I will let it run for now and see if it finishes.
Usually everything crashes at this point. I switched to another 7900 XTX, yet the same issue occurs: once I hit KSampler, it is only a matter of time before it crashes my whole system.
All the hardware is the same except the motherboard; the workflow I run is identical to one I run on another machine with the same hardware specs.
I started lowering the resolution. For some reason 1200x800 can cause issues while 500x500 runs fine, but even then the VAE encode exhausts my 24 GB of VRAM on a model I have never had issues with.
This is insane: a 500x500 px latent image running through KSampler pushes VRAM usage from 0 to 100% of 24 GB in less than a second.
I have this issue too: when KSampler starts running, the computer slows down to the point of being unresponsive. If I exit or kill the server, the computer returns to normal. I have a 3090, 64 GB of RAM, and a beefy Ryzen. I use ComfyUI portable.
OK, in my case I fixed it. Check which CUDA version your ComfyUI needs based on the PyTorch build in use. Use this command:
python -c "import torch; print(torch.version.cuda)"
It will print the required CUDA version; then install that version.
If you are using portable ComfyUI like me, you have to run the command with the Python interpreter bundled inside the portable install (see the sketch below).
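A minimal sketch of that check for the portable build, assuming the default python_embeded folder that ships with it, run from the portable ComfyUI directory:
.\python_embeded\python.exe -c "import torch; print(torch.version.cuda)"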
Update 2:
After a series of tests, I found that the driver version was wrong; the required driver is listed here: https://www.amd.com/en/resources/support-articles/release-notes/RN-AMDGPU-WINDOWS-PYTORCH-7-1-1.html
My situation is somewhat similar to yours, with an RX 9070 (16 GB VRAM + 64 GB RAM).
I'm using ComfyUI_windows_portable_amd.7z (0.4.0) with AMD Adrenalin 25.12.1.
In my case, however, the mouse initially worked, but after a while it stopped moving and the entire system froze completely.
During one of multiple attempts, the graphics driver crashed, and ComfyUI output the following log:
E:\AI\ComfyUI>SET http_proxy=http://127.0.0.1:10800
E:\AI\ComfyUI>SET https_proxy=http://127.0.0.1:10800
E:\AI\ComfyUI>.\python_embeded\python.exe -s ComfyUI\main.py --windows-standalone-build
Adding extra search path checkpoints E:\AI\StableDiffusion\userspace\models\Stable-diffusion
[WARNING] failed to run amdgpu-arch: binary not found.
Checkpoint files will always be loaded safely.
Total VRAM 16304 MB, total RAM 65462 MB
pytorch version: 2.9.0+rocmsdk20251116
Set: torch.backends.cudnn.enabled = False for better AMD performance.
AMD arch: gfx1201
ROCm version: (7, 1)
Set vram state to: NORMAL_VRAM
Device: cuda:0 AMD Radeon RX 9070 : native
Enabled pinned memory 29457.0
Using sub quadratic optimization for attention, if you have memory or speed issues try using: --use-split-cross-attention
Python version: 3.12.10 (tags/v3.12.10:0cc8128, Apr 8 2025, 12:21:36) [MSC v.1943 64 bit (AMD64)]
ComfyUI version: 0.4.0
ComfyUI frontend version: 1.33.13
[Prompt Server] web root: E:\AI\ComfyUI\python_embeded\Lib\site-packages\comfyui_frontend_package\static
Total VRAM 16304 MB, total RAM 65462 MB
pytorch version: 2.9.0+rocmsdk20251116
Set: torch.backends.cudnn.enabled = False for better AMD performance.
AMD arch: gfx1201
ROCm version: (7, 1)
Set vram state to: NORMAL_VRAM
Device: cuda:0 AMD Radeon RX 9070 : native
Enabled pinned memory 29457.0
Import times for custom nodes:
0.0 seconds: E:\AI\ComfyUI\ComfyUI\custom_nodes\websocket_image_save.py
Context impl SQLiteImpl.
Will assume non-transactional DDL.
No target revision found.
Starting server
To see the GUI go to: http://127.0.0.1:8188
got prompt
model weight dtype torch.float16, manual cast: None
model_type EPS
Using split attention in VAE
Using split attention in VAE
VAE load device: cuda:0, offload device: cpu, dtype: torch.bfloat16
CLIP/text encoder model load device: cuda:0, offload device: cpu, current: cpu, dtype: torch.float16
Requested to load SD1ClipModel
loaded completely; 14633.80 MB usable, 235.84 MB loaded, full load: True
Requested to load BaseModel
loaded completely; 14266.08 MB usable, 1639.41 MB loaded, full load: True
100%|██████████████████████████████████████████████████████████████████████████████████| 20/20 [00:22<00:00, 1.14s/it]
Requested to load AutoencoderKL
loaded completely; 10368.12 MB usable, 159.56 MB loaded, full load: True
E:\AI\ComfyUI\python_embeded\Lib\site-packages\torch\nn\modules\conv.py:543: UserWarning: bgemm_internal_cublaslt error: HIPBLAS_STATUS_INTERNAL_ERROR when calling hipblasLtMatmul with transpose_mat1 0 transpose_mat2 0 m 4096 n 4 k 4 lda 4096 ldb 4 ldc 4096 abType 14 cType 14 computeType 2 scaleType 0. Will attempt to recover by calling cublas instead. (Triggered internally at C:/b/pytorch/aten/src/ATen/hip/HIPBlas.cpp:560.)
return F.conv2d(
!!! Exception during processing !!! CUDA error: HIPBLAS_STATUS_ALLOC_FAILED when calling `hipblasCreate(handle)`
Traceback (most recent call last):
File "E:\AI\ComfyUI\ComfyUI\execution.py", line 515, in execute
output_data, output_ui, has_subgraph, has_pending_tasks = await get_output_data(prompt_id, unique_id, obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb, v3_data=v3_data)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\AI\ComfyUI\ComfyUI\execution.py", line 329, in get_output_data
return_values = await _async_map_node_over_list(prompt_id, unique_id, obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb, v3_data=v3_data)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\AI\ComfyUI\ComfyUI\execution.py", line 303, in _async_map_node_over_list
await process_inputs(input_dict, i)
File "E:\AI\ComfyUI\ComfyUI\execution.py", line 291, in process_inputs
result = f(**inputs)
^^^^^^^^^^^
File "E:\AI\ComfyUI\ComfyUI\nodes.py", line 298, in decode
images = vae.decode(samples["samples"])
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\AI\ComfyUI\ComfyUI\comfy\sd.py", line 783, in decode
out = self.process_output(self.first_stage_model.decode(samples, **vae_options).to(self.output_device).float())
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\AI\ComfyUI\ComfyUI\comfy\ldm\models\autoencoder.py", line 252, in decode
dec = self.post_quant_conv(z)
^^^^^^^^^^^^^^^^^^^^^^^
File "E:\AI\ComfyUI\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1775, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\AI\ComfyUI\python_embeded\Lib\site-packages\torch\nn\modules\module.py", line 1786, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\AI\ComfyUI\ComfyUI\comfy\ops.py", line 200, in forward
return super().forward(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\AI\ComfyUI\python_embeded\Lib\site-packages\torch\nn\modules\conv.py", line 548, in forward
return self._conv_forward(input, self.weight, self.bias)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "E:\AI\ComfyUI\python_embeded\Lib\site-packages\torch\nn\modules\conv.py", line 543, in _conv_forward
return F.conv2d(
^^^^^^^^^
RuntimeError: CUDA error: HIPBLAS_STATUS_ALLOC_FAILED when calling `hipblasCreate(handle)`
Prompt executed in 30.49 seconds
This is the workflow:
Update:
After some attempts, I found that I can't generate images with a width or height of 512 or greater. For example, a 384x384 image generates without freezing.