HIP memory issue on 9070 XT with ROCm 7.0.2
Custom Node Testing
- [x] I have tried disabling custom nodes and the issue persists (see how to disable custom nodes if you need help)
Expected Behavior
A simple generation of an image.
Actual Behavior
During image generation, the following error message appears.
Steps to Reproduce
Install rocm-7.0.2 and run a default workflow with KSampler -> Ultimate SD Upscale -> FaceDetailer.
After the first image, every subsequent generation fails.
Debug Logs
# ComfyUI Error Report
## Error Details
- **Node ID:** 116
- **Node Type:** FaceDetailer
- **Exception Type:** torch.AcceleratorError
- **Exception Message:** HIP error: an illegal memory access was encountered
HIP kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing AMD_SERIALIZE_KERNEL=3
Compile with `TORCH_USE_HIP_DSA` to enable device-side assertions.
## Stack Trace
File "/home/lasse/ComfyUI/execution.py", line 496, in execute
output_data, output_ui, has_subgraph, has_pending_tasks = await get_output_data(prompt_id, unique_id, obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb, hidden_inputs=hidden_inputs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/execution.py", line 315, in get_output_data
return_values = await _async_map_node_over_list(prompt_id, unique_id, obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb, hidden_inputs=hidden_inputs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/custom_nodes/comfyui-lora-manager/py/metadata_collector/metadata_hook.py", line 165, in async_map_node_over_list_with_metadata
results = await original_map_node_over_list(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/execution.py", line 289, in _async_map_node_over_list
await process_inputs(input_dict, i)
File "/home/lasse/ComfyUI/execution.py", line 277, in process_inputs
result = f(**inputs)
^^^^^^^^^^^
File "/home/lasse/ComfyUI/custom_nodes/comfyui-impact-pack/modules/impact/impact_pack.py", line 876, in doit
enhanced_img, cropped_enhanced, cropped_enhanced_alpha, mask, cnet_pil_list = FaceDetailer.enhance_face(
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/custom_nodes/comfyui-impact-pack/modules/impact/impact_pack.py", line 830, in enhance_face
DetailerForEach.do_detail(image, segs, model, clip, vae, guide_size, guide_size_for_bbox, max_size, seed, steps, cfg,
File "/home/lasse/ComfyUI/custom_nodes/comfyui-impact-pack/modules/impact/impact_pack.py", line 362, in do_detail
enhanced_image, cnet_pils = core.enhance_detail(cropped_image, model, clip, vae, guide_size, guide_size_for_bbox, max_size,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/custom_nodes/comfyui-impact-pack/modules/impact/core.py", line 383, in enhance_detail
refined_latent = impact_sampling.ksampler_wrapper(model2, seed2, steps2, cfg2, sampler_name2, scheduler2, positive2, negative2,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/custom_nodes/comfyui-impact-pack/modules/impact/impact_sampling.py", line 209, in ksampler_wrapper
refined_latent = separated_sample(model, True, seed, advanced_steps, cfg, sampler_name, scheduler,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/custom_nodes/comfyui-impact-pack/modules/impact/impact_sampling.py", line 182, in separated_sample
res = sample_with_custom_noise(model, add_noise, seed, cfg, positive, negative, impact_sampler, sigmas, latent_image, noise=noise, callback=callback)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/custom_nodes/comfyui-impact-pack/modules/impact/impact_sampling.py", line 126, in sample_with_custom_noise
samples = comfy.sample.sample_custom(model, noise, cfg, sampler, sigmas, positive, negative, latent_image,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/comfy/sample.py", line 50, in sample_custom
samples = comfy.samplers.sample(model, noise, positive, negative, cfg, model.load_device, sampler, sigmas, model_options=model.model_options, latent_image=latent_image, denoise_mask=noise_mask, callback=callback, disable_pbar=disable_pbar, seed=seed)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/comfy/samplers.py", line 1044, in sample
return cfg_guider.sample(noise, latent_image, sampler, sigmas, denoise_mask, callback, disable_pbar, seed)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/comfy/samplers.py", line 1029, in sample
output = executor.execute(noise, latent_image, sampler, sigmas, denoise_mask, callback, disable_pbar, seed)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/comfy/patcher_extension.py", line 112, in execute
return self.original(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/comfy/samplers.py", line 997, in outer_sample
output = self.inner_sample(noise, latent_image, device, sampler, sigmas, denoise_mask, callback, disable_pbar, seed)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/comfy/samplers.py", line 980, in inner_sample
samples = executor.execute(self, sigmas, extra_args, callback, noise, latent_image, denoise_mask, disable_pbar)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/comfy/patcher_extension.py", line 112, in execute
return self.original(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/comfy/samplers.py", line 752, in sample
samples = self.sampler_function(model_k, noise, sigmas, extra_args=extra_args, callback=k_callback, disable=disable_pbar, **self.extra_options)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/.venv/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/comfy/k_diffusion/sampling.py", line 795, in sample_dpmpp_2m
denoised = model(x, sigmas[i] * s_in, **extra_args)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/comfy/samplers.py", line 401, in __call__
out = self.inner_model(x, sigma, model_options=model_options, seed=seed)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/comfy/samplers.py", line 953, in __call__
return self.outer_predict_noise(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/comfy/samplers.py", line 960, in outer_predict_noise
).execute(x, timestep, model_options, seed)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/comfy/patcher_extension.py", line 112, in execute
return self.original(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/comfy/samplers.py", line 963, in predict_noise
return sampling_function(self.inner_model, x, timestep, self.conds.get("negative", None), self.conds.get("positive", None), self.cfg, model_options=model_options, seed=seed)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/comfy/samplers.py", line 381, in sampling_function
out = calc_cond_batch(model, conds, x, timestep, model_options)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/comfy/samplers.py", line 206, in calc_cond_batch
return _calc_cond_batch_outer(model, conds, x_in, timestep, model_options)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/comfy/samplers.py", line 214, in _calc_cond_batch_outer
return executor.execute(model, conds, x_in, timestep, model_options)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/comfy/patcher_extension.py", line 112, in execute
return self.original(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/comfy/samplers.py", line 326, in _calc_cond_batch
output = model.apply_model(input_x, timestep_, **c).chunk(batch_chunks)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/comfy/model_base.py", line 161, in apply_model
return comfy.patcher_extension.WrapperExecutor.new_class_executor(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/comfy/patcher_extension.py", line 112, in execute
return self.original(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/comfy/model_base.py", line 200, in _apply_model
model_output = self.diffusion_model(xc, t, context=context, control=control, transformer_options=transformer_options, **extra_conds).float()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/.venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1773, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/.venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1784, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/comfy/ldm/modules/diffusionmodules/openaimodel.py", line 831, in forward
return comfy.patcher_extension.WrapperExecutor.new_class_executor(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/comfy/patcher_extension.py", line 112, in execute
return self.original(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/comfy/ldm/modules/diffusionmodules/openaimodel.py", line 873, in _forward
h = forward_timestep_embed(module, h, emb, context, transformer_options, time_context=time_context, num_video_frames=num_video_frames, image_only_indicator=image_only_indicator)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/comfy/ldm/modules/diffusionmodules/openaimodel.py", line 38, in forward_timestep_embed
x = layer(x, emb)
^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/.venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1773, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/.venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1784, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/comfy/ldm/modules/diffusionmodules/openaimodel.py", line 239, in forward
return checkpoint(
^^^^^^^^^^^
File "/home/lasse/ComfyUI/comfy/ldm/modules/diffusionmodules/util.py", line 191, in checkpoint
return func(*inputs)
^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/comfy/ldm/modules/diffusionmodules/openaimodel.py", line 252, in _forward
h = self.in_layers(x)
^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/.venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1773, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/.venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1784, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/.venv/lib/python3.12/site-packages/torch/nn/modules/container.py", line 244, in forward
input = module(input)
^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/.venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1773, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/.venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1784, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/comfy/ops.py", line 146, in forward
return self.forward_comfy_cast_weights(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/comfy/ops.py", line 141, in forward_comfy_cast_weights
return self._conv_forward(input, weight, bias)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/.venv/lib/python3.12/site-packages/torch/nn/modules/conv.py", line 543, in _conv_forward
return F.conv2d(
^^^^^^^^^
## System Information
- **ComfyUI Version:** 0.3.65
- **Arguments:** main.py --listen 0.0.0.0 --output-directory /home/lasse/MEGA/ComfyUI --use-pytorch-cross-attention --reserve-vram 1 --lowvram --fast --disable-smart-memory
- **OS:** posix
- **Python Version:** 3.12.3 (main, Aug 14 2025, 17:47:21) [GCC 13.3.0]
- **Embedded Python:** false
- **PyTorch Version:** 2.8.0+rocm7.0.2.git245bf6ed
## Devices
- **Name:** cuda:0 AMD Radeon Graphics : native
- **Type:** cuda
- **VRAM Total:** 17095983104
- **VRAM Free:** 15807266816
- **Torch VRAM Total:** 161480704
- **Torch VRAM Free:** 17809408
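For readability, the byte counts above can be converted to GiB and MiB. This is only a small arithmetic sketch using the values copied from this report; the helper names are made up for illustration. The numbers were captured when the report was generated, so they may not reflect memory usage at the moment of the crash.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2

# Values copied from the Devices section of this report (bytes).
vram_total = 17095983104
vram_free = 15807266816
torch_vram_total = 161480704
torch_vram_free = 17809408


def to_gib(num_bytes):
    """Convert a byte count to GiB."""
    return num_bytes / GIB


def to_mib(num_bytes):
    """Convert a byte count to MiB."""
    return num_bytes / MIB


print(f"VRAM total:       {to_gib(vram_total):.2f} GiB")
print(f"VRAM free:        {to_gib(vram_free):.2f} GiB")
print(f"Torch VRAM total: {to_mib(torch_vram_total):.2f} MiB")
print(f"Torch VRAM free:  {to_mib(torch_vram_free):.2f} MiB")
```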
## Logs
2025-10-16T15:00:55.196764 - File "/home/lasse/ComfyUI/main.py", line 195, in prompt_worker
e.execute(item[2], prompt_id, item[3], item[4])
File "/home/lasse/ComfyUI/execution.py", line 649, in execute
asyncio.run(self.execute_async(prompt, prompt_id, extra_data, execute_outputs))
File "/usr/lib/python3.12/asyncio/runners.py", line 194, in run
return runner.run(main)
^^^^^^^^^^^^^^^^
File "/usr/lib/python3.12/asyncio/runners.py", line 118, in run
return self._loop.run_until_complete(task)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.12/asyncio/base_events.py", line 687, in run_until_complete
return future.result()
^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/execution.py", line 722, in execute_async
comfy.model_management.unload_all_models()
File "/home/lasse/ComfyUI/comfy/model_management.py", line 1399, in unload_all_models
free_memory(1e30, get_torch_device())
^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/comfy/model_management.py", line 187, in get_torch_device
return torch.device(torch.cuda.current_device())
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/.venv/lib/python3.12/site-packages/torch/cuda/__init__.py", line 1072, in current_device
return torch._C._cuda_getDevice()
^^^^^^^^^^^^^^^^^^^^^^^^^^
torch.AcceleratorError: HIP error: an illegal memory access was encountered
HIP kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing AMD_SERIALIZE_KERNEL=3
Compile with `TORCH_USE_HIP_DSA` to enable device-side assertions.
## Attached Workflow
Please make sure that workflow does not contain any sensitive information such as API keys or passwords.
Workflow too large. Please manually upload the workflow from local file system.
## Additional Context
(Please add any additional context or steps to reproduce the error here)
Other
No response
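The error message suggests passing AMD_SERIALIZE_KERNEL=3 so that the stack trace points closer to the kernel that actually failed. A minimal sketch of one way to do that from Python is shown here; `build_debug_env` and `launch_comfyui` are hypothetical helper names, not part of ComfyUI, and the only environment variable set is the one named in the error text.

```python
import os
import subprocess
import sys


def build_debug_env(base_env=None):
    """Return a copy of the environment with kernel serialization enabled."""
    env = dict(os.environ if base_env is None else base_env)
    # AMD_SERIALIZE_KERNEL=3 is the value suggested by the HIP error message.
    env["AMD_SERIALIZE_KERNEL"] = "3"
    return env


def launch_comfyui(main_py, extra_args=()):
    """Run ComfyUI's main.py with the debug environment (hypothetical helper)."""
    cmd = [sys.executable, main_py, *extra_args]
    # check=False: a crash is the expected outcome while reproducing the bug.
    return subprocess.run(cmd, env=build_debug_env(), check=False)
```

For example, `launch_comfyui("/home/lasse/ComfyUI/main.py", ["--lowvram"])` would start ComfyUI with the variable set. Serialized kernels are likely to run slower, so this is meant for reproducing the crash, not for normal use.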
I have an identical issue (9070 XT) with any ROCm 7 build, on both the latest AMD proprietary drivers and the Ubuntu base AMD drivers.
Kubuntu 24.04, latest nightly ComfyUI, but it has been happening over the past few weeks of ComfyUI builds.
I'm just using the PyTorch ROCm 6.4 build for now because it doesn't crash like this.
How did you install ROCm: with the amdgpu-installer and DKMS, or in another way?
Yes, with the amdgpu-installer and with DKMS,
and PyTorch was installed with pip3:
pip3 install -U --pre torch torchvision --index-url https://download.pytorch.org/whl/nightly/rocm7.0
I have the same issue. Here is how I installed ROCm. I've since reverted, but these were my commands:
wget https://repo.radeon.com/amdgpu-install/7.0.2/ubuntu/noble/amdgpu-install_7.0.2.70002-1_all.deb
sudo apt install ./amdgpu-install_7.0.2.70002-1_all.deb
sudo apt update
sudo apt install rocm
apt list --installed | grep rocm  # Confirming the version
# Then in my test venv
source newenv/bin/activate
pip list
pip uninstall torch torchaudio torchvision
pip3 install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/rocm7.0
For some reason this broke my other venv for ComfyUI, which runs stable PyTorch with ROCm 6.4. I don't know whether that's a ComfyUI issue or my system getting confused by having ROCm 7 installed while using the stable ROCm 6.4 PyTorch build.
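When several venvs carry different PyTorch builds, it can help to confirm which ROCm version each wheel was built against. A rough sketch, assuming the version string follows the `+rocmX.Y[.Z]` local-version pattern seen in this thread (for example `2.8.0+rocm7.0.2.git245bf6ed`); `rocm_version_from_torch` is a hypothetical helper name.

```python
import re


def rocm_version_from_torch(version_string):
    """Extract the ROCm version tuple from a PyTorch version string, or None."""
    # Matches the "+rocm7.0.2" part and stops before suffixes such as ".git245bf6ed".
    match = re.search(r"\+rocm(\d+(?:\.\d+)*)", version_string)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


# Version string taken from the System Information section of the report.
print(rocm_version_from_torch("2.8.0+rocm7.0.2.git245bf6ed"))
```

Inside each activated venv, the same function can be called with `torch.__version__`; `torch.version.hip` also reports the HIP version the build was compiled with.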
My error
hidden_states_slice = torch.bmm(attn_probs.to(value.dtype), value)
/home/auser/git/comfy/ComfyUI/comfy/ldm/modules/sub_quadratic_attention.py:180: UserWarning: HIP warning: an illegal memory access was encountered (Triggered internally at /pytorch/aten/src/ATen/hip/impl/HIPGuardImplMasqueradingAsCUDA.h:83.)
hidden_states_slice = torch.bmm(attn_probs.to(value.dtype), value)
0%| | 0/2 [00:06<?, ?it/s]
!!! Exception during processing !!! HIP error: an illegal memory access was encountered
Search for `hipErrorIllegalAddress' in https://docs.nvidia.com/cuda/cuda-runtime-api/group__HIPRT__TYPES.html for more information.
Compile with `TORCH_USE_HIP_DSA` to enable device-side assertions.
Traceback (most recent call last):
File "/home/auser/git/comfy/ComfyUI/execution.py", line 496, in execute
output_data, output_ui, has_subgraph, has_pending_tasks = await get_output_data(prompt_id, unique_id, obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb, hidden_inputs=hidden_inputs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/auser/git/comfy/ComfyUI/execution.py", line 315, in get_output_data
return_values = await _async_map_node_over_list(prompt_id, unique_id, obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb, hidden_inputs=hidden_inputs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/auser/git/comfy/ComfyUI/execution.py", line 289, in _async_map_node_over_list
await process_inputs(input_dict, i)
File "/home/auser/git/comfy/ComfyUI/execution.py", line 277, in process_inputs
result = f(**inputs)
^^^^^^^^^^^
File "/home/auser/git/comfy/ComfyUI/nodes.py", line 1559, in sample
return common_ksampler(model, noise_seed, steps, cfg, sampler_name, scheduler, positive, negative, latent_image, denoise=denoise, disable_noise=disable_noise, start_step=start_at_step, last_step=end_at_step, force_full_denoise=force_full_denoise)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/auser/git/comfy/ComfyUI/nodes.py", line 1492, in common_ksampler
samples = comfy.sample.sample(model, noise, steps, cfg, sampler_name, scheduler, positive, negative, latent_image,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/auser/git/comfy/ComfyUI/comfy/sample.py", line 45, in sample
samples = sampler.sample(noise, positive, negative, cfg=cfg, latent_image=latent_image, start_step=start_step, last_step=last_step, force_full_denoise=force_full_denoise, denoise_mask=noise_mask, sigmas=sigmas, callback=callback, disable_pbar=disable_pbar, seed=seed)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/auser/git/comfy/ComfyUI/comfy/samplers.py", line 1161, in sample
return sample(self.model, noise, positive, negative, cfg, self.device, sampler, sigmas, self.model_options, latent_image=latent_image, denoise_mask=denoise_mask, callback=callback, disable_pbar=disable_pbar, seed=seed)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/auser/git/comfy/ComfyUI/comfy/samplers.py", line 1051, in sample
return cfg_guider.sample(noise, latent_image, sampler, sigmas, denoise_mask, callback, disable_pbar, seed)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/auser/git/comfy/ComfyUI/comfy/samplers.py", line 1036, in sample
output = executor.execute(noise, latent_image, sampler, sigmas, denoise_mask, callback, disable_pbar, seed)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/auser/git/comfy/ComfyUI/comfy/patcher_extension.py", line 112, in execute
return self.original(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/auser/git/comfy/ComfyUI/comfy/samplers.py", line 1004, in outer_sample
output = self.inner_sample(noise, latent_image, device, sampler, sigmas, denoise_mask, callback, disable_pbar, seed)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/auser/git/comfy/ComfyUI/comfy/samplers.py", line 987, in inner_sample
samples = executor.execute(self, sigmas, extra_args, callback, noise, latent_image, denoise_mask, disable_pbar)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/auser/git/comfy/ComfyUI/comfy/patcher_extension.py", line 112, in execute
return self.original(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/auser/git/comfy/ComfyUI/comfy/samplers.py", line 759, in sample
samples = self.sampler_function(model_k, noise, sigmas, extra_args=extra_args, callback=k_callback, disable=disable_pbar, **self.extra_options)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/auser/git/comfy/ComfyUI/newenv/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 122, in decorate_context
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/auser/git/comfy/ComfyUI/comfy/k_diffusion/sampling.py", line 199, in sample_euler
denoised = model(x, sigma_hat * s_in, **extra_args)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/auser/git/comfy/ComfyUI/comfy/samplers.py", line 408, in __call__
out = self.inner_model(x, sigma, model_options=model_options, seed=seed)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/auser/git/comfy/ComfyUI/comfy/samplers.py", line 960, in __call__
return self.outer_predict_noise(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/auser/git/comfy/ComfyUI/comfy/samplers.py", line 967, in outer_predict_noise
).execute(x, timestep, model_options, seed)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/auser/git/comfy/ComfyUI/comfy/patcher_extension.py", line 112, in execute
return self.original(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/auser/git/comfy/ComfyUI/comfy/samplers.py", line 970, in predict_noise
return sampling_function(self.inner_model, x, timestep, self.conds.get("negative", None), self.conds.get("positive", None), self.cfg, model_options=model_options, seed=seed)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/auser/git/comfy/ComfyUI/comfy/samplers.py", line 388, in sampling_function
out = calc_cond_batch(model, conds, x, timestep, model_options)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/auser/git/comfy/ComfyUI/comfy/samplers.py", line 206, in calc_cond_batch
return _calc_cond_batch_outer(model, conds, x_in, timestep, model_options)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/auser/git/comfy/ComfyUI/comfy/samplers.py", line 214, in _calc_cond_batch_outer
return executor.execute(model, conds, x_in, timestep, model_options)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/auser/git/comfy/ComfyUI/comfy/patcher_extension.py", line 112, in execute
return self.original(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/auser/git/comfy/ComfyUI/comfy/samplers.py", line 333, in _calc_cond_batch
output = model.apply_model(input_x, timestep_, **c).chunk(batch_chunks)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/auser/git/comfy/ComfyUI/comfy/model_base.py", line 160, in apply_model
return comfy.patcher_extension.WrapperExecutor.new_class_executor(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/auser/git/comfy/ComfyUI/comfy/patcher_extension.py", line 112, in execute
return self.original(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/auser/git/comfy/ComfyUI/comfy/model_base.py", line 199, in _apply_model
model_output = self.diffusion_model(xc, t, context=context, control=control, transformer_options=transformer_options, **extra_conds).float()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/auser/git/comfy/ComfyUI/newenv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1780, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/auser/git/comfy/ComfyUI/newenv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1791, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/auser/git/comfy/ComfyUI/comfy/ldm/wan/model.py", line 614, in forward
return comfy.patcher_extension.WrapperExecutor.new_class_executor(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/auser/git/comfy/ComfyUI/comfy/patcher_extension.py", line 112, in execute
return self.original(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/auser/git/comfy/ComfyUI/comfy/ldm/wan/model.py", line 634, in _forward
return self.forward_orig(x, timestep, context, clip_fea=clip_fea, freqs=freqs, transformer_options=transformer_options, **kwargs)[:, :, :t, :h, :w]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/auser/git/comfy/ComfyUI/comfy/ldm/wan/model.py", line 579, in forward_orig
x = block(x, e=e0, freqs=freqs, context=context, context_img_len=context_img_len, transformer_options=transformer_options)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/auser/git/comfy/ComfyUI/newenv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1780, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/auser/git/comfy/ComfyUI/newenv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1791, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/auser/git/comfy/ComfyUI/comfy/ldm/wan/model.py", line 235, in forward
y = self.self_attn(
^^^^^^^^^^^^^^^
File "/home/auser/git/comfy/ComfyUI/newenv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1780, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/auser/git/comfy/ComfyUI/newenv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1791, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/auser/git/comfy/ComfyUI/comfy/ldm/wan/model.py", line 81, in forward
x = optimized_attention(
^^^^^^^^^^^^^^^^^^^^
File "/home/auser/git/comfy/ComfyUI/comfy/ldm/modules/attention.py", line 130, in wrapper
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/auser/git/comfy/ComfyUI/comfy/ldm/modules/attention.py", line 257, in attention_sub_quad
hidden_states = efficient_dot_product_attention(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/auser/git/comfy/ComfyUI/comfy/ldm/modules/sub_quadratic_attention.py", line 268, in efficient_dot_product_attention
compute_query_chunk_attn(
File "/home/auser/git/comfy/ComfyUI/comfy/ldm/modules/sub_quadratic_attention.py", line 180, in _get_attention_scores_no_kv_chunking
hidden_states_slice = torch.bmm(attn_probs.to(value.dtype), value)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
torch.AcceleratorError: HIP error: an illegal memory access was encountered
Search for `hipErrorIllegalAddress' in https://docs.nvidia.com/cuda/cuda-runtime-api/group__HIPRT__TYPES.html for more information.
Compile with `TORCH_USE_HIP_DSA` to enable device-side assertions.
Exception in thread Thread-2 (prompt_worker):
Traceback (most recent call last):
File "/usr/lib/python3.12/threading.py", line 1073, in _bootstrap_inner
self.run()
File "/usr/lib/python3.12/threading.py", line 1010, in run
self._target(*self._args, **self._kwargs)
File "/home/auser/git/comfy/ComfyUI/main.py", line 195, in prompt_worker
e.execute(item[2], prompt_id, item[3], item[4])
File "/home/auser/git/comfy/ComfyUI/execution.py", line 649, in execute
asyncio.run(self.execute_async(prompt, prompt_id, extra_data, execute_outputs))
File "/usr/lib/python3.12/asyncio/runners.py", line 194, in run
return runner.run(main)
^^^^^^^^^^^^^^^^
File "/usr/lib/python3.12/asyncio/runners.py", line 118, in run
return self._loop.run_until_complete(task)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.12/asyncio/base_events.py", line 687, in run_until_complete
return future.result()
^^^^^^^^^^^^^^^
File "/home/auser/git/comfy/ComfyUI/execution.py", line 722, in execute_async
comfy.model_management.unload_all_models()
File "/home/auser/git/comfy/ComfyUI/comfy/model_management.py", line 1402, in unload_all_models
free_memory(1e30, get_torch_device())
^^^^^^^^^^^^^^^^^^
File "/home/auser/git/comfy/ComfyUI/comfy/model_management.py", line 187, in get_torch_device
return torch.device(torch.cuda.current_device())
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/auser/git/comfy/ComfyUI/newenv/lib/python3.12/site-packages/torch/cuda/__init__.py", line 1080, in current_device
return torch._C._cuda_getDevice()
^^^^^^^^^^^^^^^^^^^^^^^^^^
torch.AcceleratorError: HIP error: an illegal memory access was encountered
Search for `hipErrorIllegalAddress' in https://docs.nvidia.com/cuda/cuda-runtime-api/group__HIPRT__TYPES.html for more information.
Compile with `TORCH_USE_HIP_DSA` to enable device-side assertions.
Side note
The output says to set AMD_SERIALIZE_KERNEL=3, but setting that produces:
UserWarning: Ignoring invalid value for boolean flag AMD_SERIALIZE_KERNEL: 3valid values are 0 or 1.
I set these environment variables as part of my troubleshooting:
export HIP_VISIBLE_DEVICES=0
export HIP_LAUNCH_BLOCKING=1
export AMD_SERIALIZE_KERNEL=1
export TORCH_BLAS_PREFER_HIPBLASLT=0
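For anyone launching ComfyUI from a Python wrapper instead of a shell, the same debug variables can be applied to a child process. This is only a rough sketch: the `debug_env` helper name and the commented launch line are illustrative assumptions, and `AMD_SERIALIZE_KERNEL` is set to 1 because the warning above rejected 3 on this build.

```python
import os

# Values copied from the export list above. Environment variables are
# always strings, so the numbers are quoted.
DEBUG_VARS = {
    "HIP_VISIBLE_DEVICES": "0",
    "HIP_LAUNCH_BLOCKING": "1",
    "AMD_SERIALIZE_KERNEL": "1",  # 3 was rejected with a UserWarning here
    "TORCH_BLAS_PREFER_HIPBLASLT": "0",
}


def debug_env(base=None):
    """Return a copy of the environment with the debug variables applied.

    `debug_env` is a hypothetical helper, not part of ComfyUI or PyTorch.
    """
    env = dict(os.environ if base is None else base)
    env.update(DEBUG_VARS)
    return env


# Hypothetical usage (adjust the path to your ComfyUI checkout):
#   import subprocess, sys
#   subprocess.run([sys.executable, "main.py"], env=debug_env())
```

Passing a copied environment to the child keeps the variables out of the parent shell, so they are easy to drop again once troubleshooting is done.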
Env details
Name: AMD Ryzen 9 7950X3D 16-Core Processor
Marketing Name: AMD Ryzen 9 7950X3D 16-Core Processor
Vendor Name: CPU
Name: gfx1201
Marketing Name: AMD Radeon RX 9070 XT
Vendor Name: AMD
Name: amdgcn-amd-amdhsa--gfx1201
Name: amdgcn-amd-amdhsa--gfx12-generic
torch 2.10.0.dev20251016+rocm7.0
HIP runtime: 7.0.51831-a3e329ad8
ROCm detected: True
Device: AMD Radeon RX 9070 XT
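The torch, HIP runtime and device lines above can also be read from inside the virtual environment, which helps when comparing reports between commenters. A minimal sketch, assuming it is run with the same interpreter ComfyUI uses; `collect_torch_info` is a hypothetical helper name, and it returns empty fields if torch is not importable.

```python
def collect_torch_info():
    """Collect the version details usually requested in ROCm bug reports."""
    info = {"torch": None, "hip": None, "device": None}
    try:
        import torch
    except ImportError:
        # torch is not installed in this interpreter
        return info
    info["torch"] = torch.__version__
    # torch.version.hip is None on builds without ROCm support
    info["hip"] = getattr(torch.version, "hip", None)
    # ROCm builds expose the GPU through the torch.cuda namespace
    if torch.cuda.is_available():
        info["device"] = torch.cuda.get_device_name(0)
    return info


print(collect_torch_info())
```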
I also double-checked that I have updated ComfyUI to the latest:
comfyui-embedded-docs 0.3.0
comfyui_frontend_package 1.28.7
comfyui_workflow_templates 0.1.95
I wouldn't be surprised if this were also a PyTorch/ROCm issue, but I didn't see any open issues on those repos and I did see this one. Thanks, fellas. I've since reverted to 6.4, but I'm happy to do another upgrade test. I suppose my life would be a bit easier if I used Docker.
This seems to be a growing issue with the 9000 series. I see a lot of people complaining about ROCm/Python instability, but no resolution. A few say it works, but none of them have so far been willing to describe how they did it (so I consider it unreliable for now).
Same issue on the 7900GRE (gfx1100) with ROCm 7.x. Tried all the ROCm parameters in the book, nothing works.
Same here: 9070XT with ROCm 7.x. It doesn't happen every time, but frequently enough to interrupt the workflow. It's always in the KSampler node for me.
System information: https://termbin.com/8e28 Output: https://gist.github.com/Nihlus/59edf3a7fc5ebb21c6bd5e243705b448
Can you try to install 7.0.2 official wheels from here: https://rocm.docs.amd.com/projects/radeon-ryzen/en/latest/docs/install/installrad/native_linux/install-pytorch.html#install-pytorch-via-pip
Installing torch from there (rocm-7.0.2) does not fix the issue for me
Same for me. Neither reinstalling, upgrading, nor any of the wheels fixes it; I get the same error every time.
Using Distorch2 MultiGPU samples solved all my problems.
How did you set up the Distorch2 MultiGPU samples?
Because I still get:
untered
HIP kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing AMD_SERIALIZE_KERNEL=3
Compile with TORCH_USE_HIP_DSA to enable device-side assertions.
Traceback (most recent call last):
File "/home/lasse/ComfyUI/execution.py", line 499, in execute
output_data, output_ui, has_subgraph, has_pending_tasks = await get_output_data(prompt_id, unique_id, obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb, hidden_inputs=hidden_inputs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/execution.py", line 316, in get_output_data
return_values = await _async_map_node_over_list(prompt_id, unique_id, obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb, hidden_inputs=hidden_inputs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/custom_nodes/ComfyUI-Lora-Manager/py/metadata_collector/metadata_hook.py", line 165, in async_map_node_over_list_with_metadata
results = await original_map_node_over_list(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/execution.py", line 290, in _async_map_node_over_list
await process_inputs(input_dict, i)
File "/home/lasse/ComfyUI/execution.py", line 278, in process_inputs
result = f(**inputs)
^^^^^^^^^^^
File "/home/lasse/ComfyUI/nodes.py", line 1525, in sample
return common_ksampler(model, seed, steps, cfg, sampler_name, scheduler, positive, negative, latent_image, denoise=denoise)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/nodes.py", line 1492, in common_ksampler
samples = comfy.sample.sample(model, noise, steps, cfg, sampler_name, scheduler, positive, negative, latent_image,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/comfy/sample.py", line 60, in sample
samples = sampler.sample(noise, positive, negative, cfg=cfg, latent_image=latent_image, start_step=start_step, last_step=last_step, force_full_denoise=force_full_denoise, denoise_mask=noise_mask, sigmas=sigmas, callback=callback, disable_pbar=disable_pbar, seed=seed)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/comfy/samplers.py", line 1163, in sample
return sample(self.model, noise, positive, negative, cfg, self.device, sampler, sigmas, self.model_options, latent_image=latent_image, denoise_mask=denoise_mask, callback=callback, disable_pbar=disable_pbar, seed=seed)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/comfy/samplers.py", line 1053, in sample
return cfg_guider.sample(noise, latent_image, sampler, sigmas, denoise_mask, callback, disable_pbar, seed)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/comfy/samplers.py", line 1035, in sample
output = executor.execute(noise, latent_image, sampler, sigmas, denoise_mask, callback, disable_pbar, seed, latent_shapes=latent_shapes)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/comfy/patcher_extension.py", line 112, in execute
return self.original(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/comfy/samplers.py", line 997, in outer_sample
output = self.inner_sample(noise, latent_image, device, sampler, sigmas, denoise_mask, callback, disable_pbar, seed, latent_shapes=latent_shapes)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/comfy/samplers.py", line 980, in inner_sample
samples = executor.execute(self, sigmas, extra_args, callback, noise, latent_image, denoise_mask, disable_pbar)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/comfy/patcher_extension.py", line 112, in execute
return self.original(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/comfy/samplers.py", line 752, in sample
samples = self.sampler_function(model_k, noise, sigmas, extra_args=extra_args, callback=k_callback, disable=disable_pbar, **self.extra_options)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/.venv/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/comfy/k_diffusion/sampling.py", line 800, in sample_dpmpp_2m
if old_denoised is None or sigmas[i + 1] == 0:
^^^^^^^^^^^^^^^^^^
torch.AcceleratorError: HIP error: an illegal memory access was encountered
HIP kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing AMD_SERIALIZE_KERNEL=3
Compile with TORCH_USE_HIP_DSA to enable device-side assertions.
Prompt executed in 202.07 seconds
Exception in thread Thread-4 (prompt_worker):
Traceback (most recent call last):
File "/usr/lib/python3.12/threading.py", line 1073, in _bootstrap_inner
self.run()
File "/usr/lib/python3.12/threading.py", line 1010, in run
self._target(*self._args, **self._kwargs)
File "/home/lasse/ComfyUI/main.py", line 240, in prompt_worker
comfy.model_management.soft_empty_cache()
File "/home/lasse/ComfyUI/comfy/model_management.py", line 1432, in soft_empty_cache
torch.cuda.empty_cache()
File "/home/lasse/ComfyUI/.venv/lib/python3.12/site-packages/torch/cuda/memory.py", line 224, in empty_cache
torch._C._cuda_emptyCache()
torch.AcceleratorError: HIP error: an illegal memory access was encountered
HIP kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing AMD_SERIALIZE_KERNEL=3
Compile with TORCH_USE_HIP_DSA to enable device-side assertions.
I run comfyui with the following batch file:
set PYTORCH_HIP_ALLOC_CONF=max_split_size_mb:512
set PYTORCH_ALLOC_CONF=max_split_size_mb:512
set HSA_OVERRIDE_GFX_VERSION=11.0.0
set PYTORCH_HIP_FORCE_SHUTDOWN=1
cmd /c "C:\ComfyUI\venv\Scripts\activate.bat && cd C:\ComfyUI && python main.py --use-pytorch-cross-attention --disable-smart-memory --listen"
Then regardless of whether I run Qwen, WAN, or Flux, I use this loader with 'virtual_vram_gb' set to around 80% of the file size of the model. Example:
So far, every workflow I throw at it works flawlessly.
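The "around 80% of the file size" rule of thumb above is easy to compute per model. A small sketch, assuming 1 GB is counted as 1024**3 bytes (the comment does not say which unit the loader uses); both function names are hypothetical.

```python
import os


def virtual_vram_gb_for_size(size_bytes, fraction=0.8):
    """Return `fraction` of a size in bytes as GB, rounded to one decimal.

    Assumes 1 GB = 1024**3 bytes; that unit is an assumption, not
    something confirmed for the loader node.
    """
    return round(size_bytes * fraction / 1024**3, 1)


def suggested_virtual_vram_gb(model_path, fraction=0.8):
    """Apply the same rule to a model file on disk."""
    return virtual_vram_gb_for_size(os.path.getsize(model_path), fraction)
```

For a 10 GB model file this gives 8.0, which would be the value to enter as `virtual_vram_gb`.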
Thanks for the input :) but sadly it still fails for me. Same OOM.
Still happening with updated rocm (https://repo.radeon.com/rocm/apt/7.1/) and torch (https://repo.radeon.com/rocm/manylinux/rocm-rel-7.1/)
On the bright side TorchCompileModel is now also broken and only generates black images (if illegal memory access doesn't happen first)
I have now upgraded to the just-released 7.1. If I'm lucky I can make one picture, but all subsequent pictures fail almost at once with the same OOM.
I have just reinstalled Ubuntu 24.04 completely, ran an apt-get update and upgrade -y, rebooted, and then installed ROCm with the following commands:
sudo apt install ./amdgpu-install_7.1.70100-1_all.deb
sudo apt update
sudo apt install python3-setuptools python3-wheel
sudo usermod -a -G render,video $LOGNAME
reboot
sudo apt install -y rocm-opencl-runtime
sudo apt purge -y rocminfo || true
sudo amdgpu-install -y --usecase=rocm,graphics,hiplibsdk --no-dkms
sudo amdgpu-install -y --usecase=graphics,hiplibsdk,rocm,mllib --no-dkms
sudo apt install -y python3-venv git python3-setuptools python3-wheel \
  graphicsmagick-imagemagick-compat llvm clang cmake gcc g++ ninja-build radeontop \
  libamd-comgr2 libhsa-runtime64-1 librccl1 librocalution0 librocblas0 librocfft0 \
  librocm-smi64-1 librocsolver0 librocsparse0 rocm-device-libs-17 rocm-smi hipcc \
  libhiprand1 libhiprtc-builtins5
export PATH=$PATH:/opt/rocm/bin
export LD_LIBRARY_PATH=/opt/rocm/lib
sudo tee /etc/ld.so.conf.d/rocm.conf <<EOF
/opt/rocm/lib
/opt/rocm/lib64
EOF
sudo ldconfig
reboot
git clone https://github.com/comfyanonymous/ComfyUI
cd ComfyUI
python3 -m venv .venv
source .venv/bin/activate
pip install --upgrade pip wheel setuptools
echo "📥 Downloading ROCm PyTorch wheels..."
pip install -r requirements.txt
wget https://repo.radeon.com/rocm/manylinux/rocm-rel-7.1/torch-2.8.0%2Brocm7.1.0.lw.git7a520360-cp312-cp312-linux_x86_64.whl
wget https://repo.radeon.com/rocm/manylinux/rocm-rel-7.1/torchvision-0.23.0%2Brocm7.1.0.git824e8c87-cp312-cp312-linux_x86_64.whl
wget https://repo.radeon.com/rocm/manylinux/rocm-rel-7.1/triton-3.4.0%2Brocm7.1.0.gitf9e5bf54-cp312-cp312-linux_x86_64.whl
wget https://repo.radeon.com/rocm/manylinux/rocm-rel-7.1/torchaudio-2.8.0%2Brocm7.1.0.git6e1c7fe9-cp312-cp312-linux_x86_64.whl
pip3 uninstall torch torchvision triton torchaudio
pip3 install torch-2.8.0+rocm7.1.0.lw.git7a520360-cp312-cp312-linux_x86_64.whl torchvision-0.23.0+rocm7.1.0.git824e8c87-cp312-cp312-linux_x86_64.whl torchaudio-2.8.0+rocm7.1.0.git6e1c7fe9-cp312-cp312-linux_x86_64.whl triton-3.4.0+rocm7.1.0.gitf9e5bf54-cp312-cp312-linux_x86_64.whl
pip install matplotlib pandas simpleeval
pip install comfyui-frontend-package --upgrade
echo "🧩 Installing ComfyUI extensions..."
cd custom_nodes
git clone -b AMD https://github.com/crystian/ComfyUI-Crystools.git && cd ComfyUI-Crystools && pip install -r requirements.txt && cd ..
git clone https://github.com/ltdrdata/ComfyUI-Manager comfyui-manager && cd comfyui-manager && pip install -r requirements.txt && cd ..
pip install diffusers
git clone https://github.com/pnikolic-amd/ComfyUI_MIGraphX.git && cd ComfyUI_MIGraphX && pip install -r requirements.txt && cd ..
git clone https://github.com/ltdrdata/comfyui-unsafe-torch
git clone https://github.com/ltdrdata/ComfyUI-Impact-Pack comfyui-impact-pack && cd comfyui-impact-pack && pip install -r requirements.txt && cd ..
git clone https://github.com/ltdrdata/ComfyUI-Impact-Subpack && cd ComfyUI-Impact-Subpack && pip install -r requirements.txt && cd ..
git clone https://github.com/chengzeyi/Comfy-WaveSpeed.git
git clone https://github.com/willmiao/ComfyUI-Lora-Manager.git
cd ComfyUI-Lora-Manager
pip install -r requirements.txt
cd ..
cd ..
I first started ComfyUI with the normal python main.py.
OOM right off the bat.
I then used the following script:
#!/bin/bash
source .venv/bin/activate
# === ROCm paths ===
export ROCM_PATH="/opt/rocm"
export HIP_PATH="$ROCM_PATH"
export PATH="$ROCM_PATH/bin:$PATH"
export LD_LIBRARY_PATH="$ROCM_PATH/lib:$ROCM_PATH/lib64:$LD_LIBRARY_PATH"
export PYTHONPATH="$ROCM_PATH/lib:$ROCM_PATH/lib64:$PYTHONPATH"
export HIP_VISIBLE_DEVICES=0
export ROCM_VISIBLE_DEVICES=0
# === GPU targeting ===
export HCC_AMDGPU_TARGET="gfx1201" # Change for your GPU
export PYTORCH_ROCM_ARCH="gfx1201" # e.g., gfx1030 for RX 6800/6900
# === Memory allocator tuning ===
export PYTORCH_HIP_ALLOC_CONF="garbage_collection_threshold:0.6,max_split_size_mb:6144"
# === Precision and performance ===
export TORCH_BLAS_PREFER_HIPBLASLT=1
export TORCHINDUCTOR_MAX_AUTOTUNE_GEMM_BACKENDS="CK,TRITON,ROCBLAS"
export TORCHINDUCTOR_MAX_AUTOTUNE_GEMM_SEARCH_SPACE="BEST"
export TORCHINDUCTOR_FORCE_FALLBACK=1
# === Flash Attention ===
export FLASH_ATTENTION_TRITON_AMD_ENABLE="TRUE"
export FLASH_ATTENTION_BACKEND="flash_attn_triton_amd"
export FLASH_ATTENTION_TRITON_AMD_SEQ_LEN=4096
export USE_CK=ON
export TRANSFORMERS_USE_FLASH_ATTENTION=1
export TRITON_USE_ROCM=ON
export TORCH_ROCM_AOTRITON_ENABLE_EXPERIMENTAL=1
# === CPU threading ===
export OMP_NUM_THREADS=8
export MKL_NUM_THREADS=8
export NUMEXPR_NUM_THREADS=8
# === Experimental ROCm flags ===
export HSA_ENABLE_ASYNC_COPY=0
export HSA_ENABLE_SDMA=1
export MIOPEN_FIND_MODE=2
export MIOPEN_ENABLE_CACHE=1
# === MIOpen cache ===
export MIOPEN_USER_DB_PATH="$HOME/.config/miopen"
export MIOPEN_CUSTOM_CACHE_DIR="$HOME/.config/miopen"
# === Launch ComfyUI ===
python3 main.py --listen 0.0.0.0 --output-directory "$HOME/ComfyUI_Output" --normalvram --reserve-vram 2 --use-quad-cross-attention --fast
It now runs the first KSampler, but when it gets to the upscale step, it crashes the entire computer.
So I am now reverting to 6.4.4, which is at least stable. Please see if you can fix the instability; it prevents me from upgrading to newer versions.
@druidican please try to use --use-pytorch-cross-attention instead of --use-quad-cross-attention
Also, in my experiments, reverting 2.73 to 1.0 in https://github.com/comfyanonymous/ComfyUI/blob/master/comfy/sd.py#L283 reduces the frequency of this issue.
Also, since https://github.com/comfyanonymous/ComfyUI/pull/10302 has been merged, MIOpen is no longer used at all. According to https://github.com/comfyanonymous/ComfyUI/issues/10460, we can try reverting this change to enable MIOpen again, but please make sure to set MIOPEN_FIND_MODE=2 before running ComfyUI, and delete the ~/.cache/miopen and ~/.config/miopen directories before starting it. This is a very good explanation of how MIOpen works and why the first run takes so much time: https://github.com/comfyanonymous/ComfyUI/pull/10302#issuecomment-3425750147 (MIOpen needs to benchmark all solutions before caching the best one, and MIOPEN_FIND_MODE=2 should speed things up a little).
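The MIOpen reset described above (delete both cache directories, then set MIOPEN_FIND_MODE=2 before ComfyUI starts) could be scripted roughly like this. It is a sketch under assumptions: `reset_miopen_cache` and `miopen_env` are hypothetical helper names, and the `home` parameter exists only so the function can be tried against a scratch directory first.

```python
import os
import shutil
from pathlib import Path


def reset_miopen_cache(home=None):
    """Delete the two MIOpen cache directories and return the ones removed."""
    home = Path(home) if home is not None else Path.home()
    removed = []
    for rel in (".cache/miopen", ".config/miopen"):
        target = home / rel
        if target.is_dir():
            shutil.rmtree(target)
            removed.append(target)
    return removed


def miopen_env(base=None):
    """Return a copy of the environment with MIOPEN_FIND_MODE=2 set.

    The variable has to be in place before ComfyUI (and torch) starts,
    so pass this dict as `env=` when launching the process.
    """
    env = dict(os.environ if base is None else base)
    env["MIOPEN_FIND_MODE"] = "2"
    return env
```

Expect the first run after clearing the caches to be slow again, since MIOpen has to rebuild its tuning database.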
@slojosic-amd I have tried what you suggested, and I get the following:
!!! Exception during processing !!! HIP error: an illegal memory access was encountered
HIP kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing AMD_SERIALIZE_KERNEL=3
Compile with TORCH_USE_HIP_DSA to enable device-side assertions.
Traceback (most recent call last):
File "/home/lasse/ComfyUI/execution.py", line 498, in execute
output_data, output_ui, has_subgraph, has_pending_tasks = await get_output_data(prompt_id, unique_id, obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb, hidden_inputs=hidden_inputs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/execution.py", line 316, in get_output_data
return_values = await _async_map_node_over_list(prompt_id, unique_id, obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb, hidden_inputs=hidden_inputs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/custom_nodes/ComfyUI-Lora-Manager/py/metadata_collector/metadata_hook.py", line 165, in async_map_node_over_list_with_metadata
results = await original_map_node_over_list(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/execution.py", line 290, in _async_map_node_over_list
await process_inputs(input_dict, i)
File "/home/lasse/ComfyUI/execution.py", line 278, in process_inputs
result = f(**inputs)
^^^^^^^^^^^
File "/home/lasse/ComfyUI/custom_nodes/comfyui-impact-pack/modules/impact/impact_pack.py", line 876, in doit
enhanced_img, cropped_enhanced, cropped_enhanced_alpha, mask, cnet_pil_list = FaceDetailer.enhance_face(
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/custom_nodes/comfyui-impact-pack/modules/impact/impact_pack.py", line 830, in enhance_face
DetailerForEach.do_detail(image, segs, model, clip, vae, guide_size, guide_size_for_bbox, max_size, seed, steps, cfg,
File "/home/lasse/ComfyUI/custom_nodes/comfyui-impact-pack/modules/impact/impact_pack.py", line 362, in do_detail
enhanced_image, cnet_pils = core.enhance_detail(cropped_image, model, clip, vae, guide_size, guide_size_for_bbox, max_size,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/custom_nodes/comfyui-impact-pack/modules/impact/core.py", line 383, in enhance_detail
refined_latent = impact_sampling.ksampler_wrapper(model2, seed2, steps2, cfg2, sampler_name2, scheduler2, positive2, negative2,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/custom_nodes/comfyui-impact-pack/modules/impact/impact_sampling.py", line 209, in ksampler_wrapper
refined_latent = separated_sample(model, True, seed, advanced_steps, cfg, sampler_name, scheduler,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/custom_nodes/comfyui-impact-pack/modules/impact/impact_sampling.py", line 182, in separated_sample
res = sample_with_custom_noise(model, add_noise, seed, cfg, positive, negative, impact_sampler, sigmas, latent_image, noise=noise, callback=callback)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/custom_nodes/comfyui-impact-pack/modules/impact/impact_sampling.py", line 126, in sample_with_custom_noise
samples = comfy.sample.sample_custom(model, noise, cfg, sampler, sigmas, positive, negative, latent_image,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/comfy/sample.py", line 65, in sample_custom
samples = comfy.samplers.sample(model, noise, positive, negative, cfg, model.load_device, sampler, sigmas, model_options=model.model_options, latent_image=latent_image, denoise_mask=noise_mask, callback=callback, disable_pbar=disable_pbar, seed=seed)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/comfy/samplers.py", line 1053, in sample
return cfg_guider.sample(noise, latent_image, sampler, sigmas, denoise_mask, callback, disable_pbar, seed)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/comfy/samplers.py", line 1035, in sample
output = executor.execute(noise, latent_image, sampler, sigmas, denoise_mask, callback, disable_pbar, seed, latent_shapes=latent_shapes)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/comfy/patcher_extension.py", line 112, in execute
return self.original(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/comfy/samplers.py", line 997, in outer_sample
output = self.inner_sample(noise, latent_image, device, sampler, sigmas, denoise_mask, callback, disable_pbar, seed, latent_shapes=latent_shapes)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/comfy/samplers.py", line 980, in inner_sample
samples = executor.execute(self, sigmas, extra_args, callback, noise, latent_image, denoise_mask, disable_pbar)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/comfy/patcher_extension.py", line 112, in execute
return self.original(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/comfy/samplers.py", line 752, in sample
samples = self.sampler_function(model_k, noise, sigmas, extra_args=extra_args, callback=k_callback, disable=disable_pbar, **self.extra_options)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/.venv/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/comfy/k_diffusion/sampling.py", line 800, in sample_dpmpp_2m
if old_denoised is None or sigmas[i + 1] == 0:
^^^^^^^^^^^^^^^^^^
torch.AcceleratorError: HIP error: an illegal memory access was encountered
HIP kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing AMD_SERIALIZE_KERNEL=3
Compile with TORCH_USE_HIP_DSA to enable device-side assertions.
Prompt executed in 96.52 seconds
Exception in thread Thread-4 (prompt_worker):
Traceback (most recent call last):
File "/usr/lib/python3.12/threading.py", line 1073, in _bootstrap_inner
self.run()
File "/usr/lib/python3.12/threading.py", line 1010, in run
self._target(*self._args, **self._kwargs)
File "/home/lasse/ComfyUI/main.py", line 233, in prompt_worker
comfy.model_management.soft_empty_cache()
File "/home/lasse/ComfyUI/comfy/model_management.py", line 1400, in soft_empty_cache
torch.cuda.empty_cache()
File "/home/lasse/ComfyUI/.venv/lib/python3.12/site-packages/torch/cuda/memory.py", line 224, in empty_cache
torch._C._cuda_emptyCache()
torch.AcceleratorError: HIP error: an illegal memory access was encountered
HIP kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing AMD_SERIALIZE_KERNEL=3
Compile with TORCH_USE_HIP_DSA to enable device-side assertions.
============================== ComfyUI Diagnostics Report - 20251031_173935
SYSTEM INFO
OS and release: PRETTY_NAME="Ubuntu 24.04.3 LTS" NAME="Ubuntu" VERSION_ID="24.04" VERSION="24.04.3 LTS (Noble Numbat)" VERSION_CODENAME=noble ID=ubuntu ID_LIKE=debian HOME_URL="https://www.ubuntu.com/" SUPPORT_URL="https://help.ubuntu.com/" BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/" PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy" UBUNTU_CODENAME=noble LOGO=ubuntu-logo
Kernel: Linux Odin 6.14.0-34-generic #34~24.04.1-Ubuntu SMP PREEMPT_DYNAMIC Tue Sep 23 15:35:20 UTC 2 x86_64 x86_64 x86_64 GNU/Linux
Uptime: 17:39:35 up 30 min, 1 user, load average: 1,11, 1,96, 1,76
GPU / ROCm INFO
rocm-smi output:
========================================= ROCm System Management Interface =========================================
=================================================== Concise Info ===================================================
Device  Node  IDs (DID, GUID)  Temp (Edge)  Power (Avg)  Partitions (Mem, Compute, ID)  SCLK     MCLK   Fan    Perf  PwrCap  VRAM%  GPU%
0       1     0x7550, 21786    35.0°C       19.0W        N/A, N/A, 0                    1348Mhz  96Mhz  14.9%  auto  280.0W  48%    7%
=============================================== End of ROCm SMI Log ================================================
rocminfo (first 50 lines): ROCk module is loaded
HSA System Attributes
Runtime Version: 1.18
Runtime Ext Version: 1.14
System Timestamp Freq.: 1000.000000MHz
Sig. Max Wait Duration: 18446744073709551615 (0xFFFFFFFFFFFFFFFF) (timestamp count)
Machine Model: LARGE
System Endianness: LITTLE
Mwaitx: DISABLED
XNACK enabled: NO
DMAbuf Support: YES
VMM Support: YES
==========
HSA Agents
Agent 1
Name: AMD Ryzen 9 5900X 12-Core Processor
Uuid: CPU-XX
Marketing Name: AMD Ryzen 9 5900X 12-Core Processor
Vendor Name: CPU
Feature: None specified
Profile: FULL_PROFILE
Float Round Mode: NEAR
Max Queue Number: 0(0x0)
Queue Min Size: 0(0x0)
Queue Max Size: 0(0x0)
Queue Type: MULTI
Node: 0
Device Type: CPU
Cache Info:
L1: 32768(0x8000) KB
Chip ID: 0(0x0)
ASIC Revision: 0(0x0)
Cacheline Size: 64(0x40)
Max Clock Freq. (MHz): 5622
BDFID: 0
Internal Node ID: 0
Compute Unit: 24
SIMDs per CU: 0
Shader Engines: 0
Shader Arrs. per Eng.: 0
WatchPts on Addr. Ranges:1
Memory Properties:
Features: None
Pool Info:
rocminfo failed
lspci | grep VGA: 09:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Navi 48 [Radeon RX 9070/9070 XT/9070 GRE] (rev c0)
ENVIRONMENT VARIABLES (filtered)
Relevant ROCm / HIP / PyTorch vars: PATH=/home/lasse/ComfyUI/.venv/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin:/snap/bin
PYTHON & VENV INFO
Python binary: /home/lasse/ComfyUI/.venv/bin/python
Python version: Python 3.12.3
Active venv (if any): /home/lasse/ComfyUI/.venv
Python packages (key ones): sys.executable: /home/lasse/ComfyUI/.venv/bin/python sys.path[0:3]: ['', '/usr/lib/python312.zip', '/usr/lib/python3.12'] torch version: 2.8.0+rocm7.1.0.git7a520360 torch.version.hip: 7.1.25424-4179531dcd torch.version.cuda: None torch.cuda.is_available: True
Installed packages (torch, comfy, rocm, pillow, accelerate): torch: 2.8.0+rocm7.1.0.lw.git7a520360 pillow: 12.0.0 accelerate: 1.11.0
COMFYUI INSTALLATION INFO
ComfyUI directory exists: /home/lasse/ComfyUI Git remote and branch: origin https://github.com/comfyanonymous/ComfyUI (fetch) origin https://github.com/comfyanonymous/ComfyUI (push) master
Custom nodes: custom_nodes comfyui-manager js misc notebooks docs glob .github scripts node_db snapshots .cache pycache .git components comfyui-unsafe-torch pycache .git ComfyUI-Easy-Use resources styles ComfyUI-Easy-Use-Frontend .github web_version py tools pycache .git locales wildcards ComfyUI-GGUF .github tools pycache comfyui-impact-pack test example_workflows js troubleshooting modules .github custom_wildcards notebook pycache .git locales wildcards ComfyUI-Crystools nodes samples web general core docs .others .github server pycache .git ComfyUI-MagCache examples assets .github pycache .git ComfyUI_MIGraphX pycache .git workflows Comfy-WaveSpeed assets .github pycache .git workflows comfyui_ultimatesdupscale modules repositories .github pycache ComfyUI-UltimateSDUpscale-GGUF .github pycache ComfyUI-Impact-Subpack modules .github pycache .git comfyui-image-saver js .github pycache saver image-size-tools assets .github pycache pycache ComfyUI-Lora-Manager example_workflows web tests docs static templates .github scripts refs wiki-images py pycache .git locales civitai
SYSTEM LOGS (last 3 minutes)
Collecting journalctl output... okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:40 vmid:8 pasid:32779) okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: in process python3 pid 8440 thread python3 pid 8440) okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: in page starting at address 0x00007577645d7000 from client 10 okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00841051 okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: Faulty UTCL2 client ID: TCP (0x8) okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: MORE_FAULTS: 0x1 okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: WALKER_ERROR: 0x0 okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: PERMISSION_FAULTS: 0x5 okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: MAPPING_ERROR: 0x0 okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: RW: 0x1 okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:40 vmid:8 pasid:32779) okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: in process python3 pid 8440 thread python3 pid 8440) okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: in page starting at address 0x000075777fdd7000 from client 10 okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00841051 okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: Faulty UTCL2 client ID: TCP (0x8) okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: MORE_FAULTS: 0x1 okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: WALKER_ERROR: 0x0 okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: PERMISSION_FAULTS: 0x5 okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: MAPPING_ERROR: 0x0 okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: RW: 0x1 okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:40 vmid:8 pasid:32779) okt 31 17:38:59 Odin 
kernel: amdgpu 0000:09:00.0: amdgpu: in process python3 pid 8440 thread python3 pid 8440) okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: in page starting at address 0x0000757766dd7000 from client 10 okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00841051 okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: Faulty UTCL2 client ID: TCP (0x8) okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: MORE_FAULTS: 0x1 okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: WALKER_ERROR: 0x0 okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: PERMISSION_FAULTS: 0x5 okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: MAPPING_ERROR: 0x0 okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: RW: 0x1 okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:40 vmid:8 pasid:32779) okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: in process python3 pid 8440 thread python3 pid 8440) okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: in page starting at address 0x00007577735d7000 from client 10 okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00841051 okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: Faulty UTCL2 client ID: TCP (0x8) okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: MORE_FAULTS: 0x1 okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: WALKER_ERROR: 0x0 okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: PERMISSION_FAULTS: 0x5 okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: MAPPING_ERROR: 0x0 okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: RW: 0x1 okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:40 vmid:8 pasid:32779) okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: in process python3 pid 8440 thread python3 pid 8440) okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: in page starting at 
address 0x0000757761dd7000 from client 10 okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00841051 okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: Faulty UTCL2 client ID: TCP (0x8) okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: MORE_FAULTS: 0x1 okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: WALKER_ERROR: 0x0 okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: PERMISSION_FAULTS: 0x5 okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: MAPPING_ERROR: 0x0 okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: RW: 0x1 okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:40 vmid:8 pasid:32779) okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: in process python3 pid 8440 thread python3 pid 8440) okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: in page starting at address 0x000075775cdd7000 from client 10 okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00841051 okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: Faulty UTCL2 client ID: TCP (0x8) okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: MORE_FAULTS: 0x1 okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: WALKER_ERROR: 0x0 okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: PERMISSION_FAULTS: 0x5 okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: MAPPING_ERROR: 0x0 okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: RW: 0x1 okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:40 vmid:8 pasid:32779) okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: in process python3 pid 8440 thread python3 pid 8440) okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: in page starting at address 0x000075775f5d7000 from client 10 okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x0084115B okt 31 17:38:59 Odin 
kernel: amdgpu 0000:09:00.0: amdgpu: Faulty UTCL2 client ID: TCP (0x8) okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: MORE_FAULTS: 0x1 okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: WALKER_ERROR: 0x5 okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: PERMISSION_FAULTS: 0x5 okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: MAPPING_ERROR: 0x1 okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: RW: 0x1 okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:173 vmid:8 pasid:32779) okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: in process python3 pid 8440 thread python3 pid 8440) okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: in page starting at address 0x00007577695d7000 from client 10 okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00841051 okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: Faulty UTCL2 client ID: TCP (0x8) okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: MORE_FAULTS: 0x1 okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: WALKER_ERROR: 0x0 okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: PERMISSION_FAULTS: 0x5 okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: MAPPING_ERROR: 0x0 okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: RW: 0x1 okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:173 vmid:8 pasid:32779) okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: in process python3 pid 8440 thread python3 pid 8440) okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: in page starting at address 0x000075775a5d7000 from client 10 okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00841051 okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: Faulty UTCL2 client ID: TCP (0x8) okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: MORE_FAULTS: 0x1 okt 31 17:38:59 Odin 
kernel: amdgpu 0000:09:00.0: amdgpu: WALKER_ERROR: 0x0 okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: PERMISSION_FAULTS: 0x5 okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: MAPPING_ERROR: 0x0 okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: RW: 0x1 okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:40 vmid:8 pasid:32779) okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: in process python3 pid 8440 thread python3 pid 8440) okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: in page starting at address 0x000075777fdeb000 from client 10 okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00841051 okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: Faulty UTCL2 client ID: TCP (0x8) okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: MORE_FAULTS: 0x1 okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: WALKER_ERROR: 0x0 okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: PERMISSION_FAULTS: 0x5 okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: MAPPING_ERROR: 0x0 okt 31 17:38:59 Odin kernel: amdgpu 0000:09:00.0: amdgpu: RW: 0x1
KERNEL GPU LOGS (amdgpu-related)
Recent dmesg lines with 'amdgpu': No amdgpu messages in dmesg
END OF REPORT
Please try:
export HSA_ENABLE_SDMA=0
Then see whether the illegal memory accesses keep happening. Judging by the journalctl messages, I think there's a bug in SDMA, the engine that lets the GPU handle memory transfers from RAM. Disabling it will probably make diffusion slower, but it might be more stable.
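If exporting the variable in the shell is inconvenient, here is a minimal sketch of a launcher that sets it for the ComfyUI process only. It also sets `AMD_SERIALIZE_KERNEL=3`, which the HIP error message itself suggests for debugging. The helper names `build_env` and `launch` and the default `main.py` path are assumptions for illustration and are not part of ComfyUI.

```python
import os
import subprocess
import sys

# Variables mentioned in this thread and in the HIP error message.
DEBUG_ENV = {
    "HSA_ENABLE_SDMA": "0",       # suggested workaround: disable SDMA transfers
    "AMD_SERIALIZE_KERNEL": "3",  # suggested by the HIP error text for debugging
}


def build_env(base=None, overrides=None):
    """Return a copy of the environment with the debug variables applied."""
    env = dict(os.environ if base is None else base)
    env.update(DEBUG_ENV if overrides is None else overrides)
    return env


def launch(script="main.py", extra_args=(), cwd=None):
    """Start the given script with the patched environment (hypothetical helper)."""
    cmd = [sys.executable, script, *extra_args]
    return subprocess.call(cmd, env=build_env(), cwd=cwd)
```

Calling `launch(cwd="/home/lasse/ComfyUI")` from the venv's Python would start ComfyUI with both variables set, and leave the parent shell untouched.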
KERNEL GPU LOGS (amdgpu-related)
Recent dmesg lines with 'amdgpu': No amdgpu messages in dmesg
Try checking manually. I have the same issue, and dmesg shows something like:
[86454.437440] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:40 vmid:8 pasid:32770)
[86454.437623] amdgpu 0000:03:00.0: amdgpu: in process python3 pid 688725 thread python3 pid 688725)
[86454.437739] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x000075c85d640000 from client 10
[86454.437856] amdgpu 0000:03:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00841051
[86454.437971] amdgpu 0000:03:00.0: amdgpu: Faulty UTCL2 client ID: TCP (0x8)
[86454.438086] amdgpu 0000:03:00.0: amdgpu: MORE_FAULTS: 0x1
[86454.438200] amdgpu 0000:03:00.0: amdgpu: WALKER_ERROR: 0x0
[86454.438311] amdgpu 0000:03:00.0: amdgpu: PERMISSION_FAULTS: 0x5
[86454.438421] amdgpu 0000:03:00.0: amdgpu: MAPPING_ERROR: 0x0
[86454.438533] amdgpu 0000:03:00.0: amdgpu: RW: 0x1
[86454.438654] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:40 vmid:8 pasid:32770)
[86454.438745] amdgpu 0000:03:00.0: amdgpu: in process python3 pid 688725 thread python3 pid 688725)
[86454.438833] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x000075c85d641000 from client 10
[86454.438921] amdgpu 0000:03:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x0084115B
[86454.439011] amdgpu 0000:03:00.0: amdgpu: Faulty UTCL2 client ID: TCP (0x8)
[86454.439100] amdgpu 0000:03:00.0: amdgpu: MORE_FAULTS: 0x1
[86454.439190] amdgpu 0000:03:00.0: amdgpu: WALKER_ERROR: 0x5
[86454.439278] amdgpu 0000:03:00.0: amdgpu: PERMISSION_FAULTS: 0x5
[86454.439368] amdgpu 0000:03:00.0: amdgpu: MAPPING_ERROR: 0x1
[86454.439457] amdgpu 0000:03:00.0: amdgpu: RW: 0x1
[86454.439550] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:173 vmid:8 pasid:32770)
[86454.439636] amdgpu 0000:03:00.0: amdgpu: in process python3 pid 688725 thread python3 pid 688725)
[86454.439724] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x000075c843e7c000 from client 10
[86454.439812] amdgpu 0000:03:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x0084115B
[86454.439900] amdgpu 0000:03:00.0: amdgpu: Faulty UTCL2 client ID: TCP (0x8)
[86454.439989] amdgpu 0000:03:00.0: amdgpu: MORE_FAULTS: 0x1
[86454.440078] amdgpu 0000:03:00.0: amdgpu: WALKER_ERROR: 0x5
[86454.440165] amdgpu 0000:03:00.0: amdgpu: PERMISSION_FAULTS: 0x5
[86454.440252] amdgpu 0000:03:00.0: amdgpu: MAPPING_ERROR: 0x1
[86454.440337] amdgpu 0000:03:00.0: amdgpu: RW: 0x1
[86454.440430] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:173 vmid:8 pasid:32770)
[86454.440519] amdgpu 0000:03:00.0: amdgpu: in process python3 pid 688725 thread python3 pid 688725)
[86454.440605] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x000075c843e7d000 from client 10
[86454.440699] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:173 vmid:8 pasid:32770)
[86454.440785] amdgpu 0000:03:00.0: amdgpu: in process python3 pid 688725 thread python3 pid 688725)
[86454.440872] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x000075c85bf3c000 from client 10
[86454.440970] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:173 vmid:8 pasid:32770)
[86454.441056] amdgpu 0000:03:00.0: amdgpu: in process python3 pid 688725 thread python3 pid 688725)
[86454.441142] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x000075c85bf3d000 from client 10
[86454.441236] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:40 vmid:8 pasid:32770)
[86454.441322] amdgpu 0000:03:00.0: amdgpu: in process python3 pid 688725 thread python3 pid 688725)
[86454.441408] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x000075c85d643000 from client 10
[86454.441505] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:173 vmid:8 pasid:32770)
[86454.441592] amdgpu 0000:03:00.0: amdgpu: in process python3 pid 688725 thread python3 pid 688725)
[86454.441679] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x000075c843e7f000 from client 10
[86454.441773] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:173 vmid:8 pasid:32770)
[86454.441860] amdgpu 0000:03:00.0: amdgpu: in process python3 pid 688725 thread python3 pid 688725)
[86454.441947] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x000075c85bf3f000 from client 10
[86454.442041] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:24 vmid:8 pasid:32770)
[86454.442128] amdgpu 0000:03:00.0: amdgpu: in process python3 pid 688725 thread python3 pid 688725)
[86454.442214] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x000075cccdfff000 from client 10
[86454.447455] workqueue: amdgpu_irq_handle_ih1 [amdgpu] hogged CPU for >10000us 5 times, consider switching to WQ_UNBOUND
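To compare fault patterns between machines without reading the whole dump, a rough sketch like the following can summarize saved `dmesg` or `journalctl` text. It assumes the message layout seen in the logs in this thread; the function name `summarize_faults` is made up for illustration, and the regular expressions may need adjusting for other kernel versions.

```python
import re
from collections import Counter

# Patterns copied from the amdgpu messages quoted in this thread.
FAULT_RE = re.compile(
    r"\[gfxhub\] page fault \(src_id:(\d+) ring:(\d+) vmid:(\d+) pasid:(\d+)\)"
)
ADDR_RE = re.compile(r"in page starting at address (0x[0-9a-fA-F]+)")
STATUS_RE = re.compile(r"GCVM_L2_PROTECTION_FAULT_STATUS:(0x[0-9a-fA-F]+)")


def summarize_faults(log_text):
    """Count page faults per ring and per status code, and list fault addresses."""
    rings = Counter(m.group(2) for m in FAULT_RE.finditer(log_text))
    return {
        "faults": sum(rings.values()),
        "rings": dict(rings),
        "addresses": ADDR_RE.findall(log_text),
        "statuses": dict(Counter(STATUS_RE.findall(log_text))),
    }
```

For example, `summarize_faults(open("dmesg.txt").read())` on a saved log (the file name is hypothetical) returns a dict that is easier to paste into a report than the raw dump. It works on flattened text as well, since it does not depend on line breaks.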
Please try:
export HSA_ENABLE_SDMA=0
And see if the illegal memory accesses keep happening. I think there's a bug in the SDMA according to the journalctl messages, which lets GPU handle memory transfers from RAM. This will probably make diffusion slower but it might be more stable.
It seems to be at least getting better. Setup: RX 9070 / ROCm 7.1 / PyTorch ROCm 7 nightly from pytorch.org. With Qwen-image I was getting this error after 2-3 generations, and now I had 6 in a row run just fine. However, I had some other variables exported for testing, so let me try a fresh console with only 'HSA_ENABLE_SDMA=0' set.
The performance impact is negligible, about 0.05 s/it. Given that ROCm 7 PyTorch is WAY faster than 6.x here, I'd say it is absolutely worth the hassle.
Update: no, it is only "getting better". I still get the error, just after a while. My variable set was:
declare -x HSA_ENABLE_SDMA="0"
declare -x PYTORCH_ALLOC_CONF="max_split_size_mb:512"
declare -x PYTORCH_HIP_ALLOC_CONF="max_split_size_mb:512"
declare -x PYTORCH_HIP_FORCE_SHUTDOWN="1"
HSA_ENABLE_SDMA alone does not do the trick.
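Since the result seems to depend on which variables are set together, it may help to paste the exact set with each report. A small sketch for that, run with the same Python that runs ComfyUI: the prefix list is my own guess at what is relevant, and `relevant_env` is a hypothetical helper name.

```python
import os

# Prefixes assumed to cover the ROCm / HIP / PyTorch variables discussed here.
PREFIXES = ("HSA_", "HIP_", "ROCM_", "AMD_", "PYTORCH_", "TORCH_")


def relevant_env(environ=None, prefixes=PREFIXES):
    """Return the matching environment variables, sorted by name."""
    environ = os.environ if environ is None else environ
    return {
        name: value
        for name, value in sorted(environ.items())
        if name.startswith(prefixes)
    }


def format_env(env):
    """Render the variables as NAME=value lines for pasting into a comment."""
    return "\n".join(f"{name}={value}" for name, value in env.items())
```

Running `print(format_env(relevant_env()))` in the console that starts ComfyUI prints only the matching variables, so two reports can be compared line by line.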
Well, no dice. Now I cannot make a single picture. It fails during the first KSampler, every time.
Traceback (most recent call last):
File "/home/lasse/ComfyUI/execution.py", line 510, in execute
output_data, output_ui, has_subgraph, has_pending_tasks = await get_output_data(prompt_id, unique_id, obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb, hidden_inputs=hidden_inputs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/execution.py", line 324, in get_output_data
return_values = await _async_map_node_over_list(prompt_id, unique_id, obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb, hidden_inputs=hidden_inputs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/custom_nodes/ComfyUI-Lora-Manager/py/metadata_collector/metadata_hook.py", line 165, in async_map_node_over_list_with_metadata
results = await original_map_node_over_list(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/execution.py", line 298, in _async_map_node_over_list
await process_inputs(input_dict, i)
File "/home/lasse/ComfyUI/execution.py", line 286, in process_inputs
result = f(**inputs)
^^^^^^^^^^^
File "/home/lasse/ComfyUI/comfy_extras/nodes_custom_sampler.py", line 835, in sample
samples = guider.sample(noise.generate_noise(latent), latent_image, sampler, sigmas, denoise_mask=noise_mask, callback=callback, disable_pbar=disable_pbar, seed=noise.seed)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/comfy/samplers.py", line 1035, in sample
output = executor.execute(noise, latent_image, sampler, sigmas, denoise_mask, callback, disable_pbar, seed, latent_shapes=latent_shapes)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/comfy/patcher_extension.py", line 112, in execute
return self.original(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/comfy/samplers.py", line 997, in outer_sample
output = self.inner_sample(noise, latent_image, device, sampler, sigmas, denoise_mask, callback, disable_pbar, seed, latent_shapes=latent_shapes)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/comfy/samplers.py", line 980, in inner_sample
samples = executor.execute(self, sigmas, extra_args, callback, noise, latent_image, denoise_mask, disable_pbar)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/comfy/patcher_extension.py", line 112, in execute
return self.original(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/comfy/samplers.py", line 752, in sample
samples = self.sampler_function(model_k, noise, sigmas, extra_args=extra_args, callback=k_callback, disable=disable_pbar, **self.extra_options)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/.venv/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/comfy/k_diffusion/sampling.py", line 199, in sample_euler
denoised = model(x, sigma_hat * s_in, **extra_args)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/comfy/samplers.py", line 401, in call
out = self.inner_model(x, sigma, model_options=model_options, seed=seed)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/comfy/samplers.py", line 953, in call
return self.outer_predict_noise(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/comfy/samplers.py", line 960, in outer_predict_noise
).execute(x, timestep, model_options, seed)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/comfy/patcher_extension.py", line 112, in execute
return self.original(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/comfy/samplers.py", line 963, in predict_noise
return sampling_function(self.inner_model, x, timestep, self.conds.get("negative", None), self.conds.get("positive", None), self.cfg, model_options=model_options, seed=seed)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/comfy/samplers.py", line 381, in sampling_function
out = calc_cond_batch(model, conds, x, timestep, model_options)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/comfy/samplers.py", line 206, in calc_cond_batch
return _calc_cond_batch_outer(model, conds, x_in, timestep, model_options)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/comfy/samplers.py", line 214, in _calc_cond_batch_outer
return executor.execute(model, conds, x_in, timestep, model_options)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/comfy/patcher_extension.py", line 112, in execute
return self.original(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/comfy/samplers.py", line 324, in calc_cond_batch
output = model_options['model_function_wrapper'](model.apply_model, {"input": input_x, "timestep": timestep, "c": c, "cond_or_uncond": cond_or_uncond}).chunk(batch_chunks)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/custom_nodes/ComfyUI-MagCache/nodes.py", line 807, in unet_wrapper_function
return model_function(input, timestep, **c)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/comfy/model_base.py", line 161, in apply_model
return comfy.patcher_extension.WrapperExecutor.new_class_executor(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/comfy/patcher_extension.py", line 112, in execute
return self.original(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/comfy/model_base.py", line 203, in _apply_model
model_output = self.diffusion_model(xc, t, context=context, control=control, transformer_options=transformer_options, **extra_conds)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/.venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1773, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/.venv/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1784, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/comfy/ldm/chroma/model.py", line 261, in forward
return comfy.patcher_extension.WrapperExecutor.new_class_executor(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/comfy/patcher_extension.py", line 112, in execute
return self.original(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/comfy/ldm/chroma/model.py", line 284, in _forward
out = self.forward_orig(img, img_ids, context, txt_ids, timestep, guidance, control, transformer_options, attn_mask=kwargs.get("attention_mask", None))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/lasse/ComfyUI/custom_nodes/ComfyUI-MagCache/nodes.py", line 680, in magcache_chroma_forward
self.residual_cache[self.cnt%2] = (img - ori_img).to(mm.unet_offload_device())
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
torch.AcceleratorError: HIP error: an illegal memory access was encountered
Compile with TORCH_USE_HIP_DSA to enable device-side assertions.
Prompt executed in 52.77 seconds
Exception in thread Thread-6 (prompt_worker):
Traceback (most recent call last):
File "/usr/lib/python3.12/threading.py", line 1073, in _bootstrap_inner
self.run()
File "/usr/lib/python3.12/threading.py", line 1010, in run
self._target(*self._args, **self._kwargs)
File "/home/lasse/ComfyUI/main.py", line 242, in prompt_worker
comfy.model_management.soft_empty_cache()
File "/home/lasse/ComfyUI/comfy/model_management.py", line 1432, in soft_empty_cache
torch.cuda.empty_cache()
File "/home/lasse/ComfyUI/.venv/lib/python3.12/site-packages/torch/cuda/memory.py", line 224, in empty_cache
torch._C._cuda_emptyCache()
torch.AcceleratorError: HIP error: an illegal memory access was encountered
Compile with TORCH_USE_HIP_DSA to enable device-side assertions.
Would it help if I wrote up a description of my Linux setup, including environment variables and the like?
nov 06 06:01:00 Odin systemd[1]: tmp-snap.rootfs_0VjBMz.mount: Deactivated successfully.
nov 06 06:01:01 Odin kernel: audit: type=1400 audit(1762405261.015:153): apparmor="DENIED" operation="open" class="file" profile="snap.f>
nov 06 06:01:01 Odin systemd[2257]: snap.firmware-updater.firmware-notifier.service: Consumed 1.374s CPU time.
nov 06 06:01:01 Odin kernel: amdgpu 0000:09:00.0: amdgpu: 000000005abb2ae3 pin failed
nov 06 06:01:01 Odin kernel: [drm:amdgpu_dm_plane_helper_prepare_fb [amdgpu]] ERROR Failed to pin framebuffer with error -12
nov 06 06:01:01 Odin gnome-shell[2562]: Page flip failed: drmModeAtomicCommit: Kan ikke tildele hukommelse
nov 06 06:01:01 Odin kernel: workqueue: svm_range_restore_work [amdgpu] hogged CPU for >10000us 5 times, consider switching to WQ_UNBOUND
nov 06 06:01:01 Odin kernel: [drm:amdgpu_cs_ioctl [amdgpu]] ERROR Not enough memory for command submission!
nov 06 06:01:01 Odin kernel: [drm:amdgpu_cs_ioctl [amdgpu]] ERROR Not enough memory for command submission!
nov 06 06:01:01 Odin kernel: [drm:amdgpu_cs_ioctl [amdgpu]] ERROR Not enough memory for command submission!
nov 06 06:01:01 Odin kernel: [drm:amdgpu_cs_ioctl [amdgpu]] ERROR Not enough memory for command submission!
nov 06 06:01:01 Odin kernel: [drm:amdgpu_cs_ioctl [amdgpu]] ERROR Not enough memory for command submission!
nov 06 06:01:02 Odin kernel: amdgpu 0000:09:00.0: amdgpu: 0000000085653ad3 pin failed
nov 06 06:01:02 Odin kernel: [drm:amdgpu_dm_plane_helper_prepare_fb [amdgpu]] ERROR Failed to pin framebuffer with error -12
nov 06 06:01:02 Odin kernel: [drm:amdgpu_cs_ioctl [amdgpu]] ERROR Not enough memory for command submission!
nov 06 06:01:02 Odin kernel: [drm:amdgpu_cs_ioctl [amdgpu]] ERROR Not enough memory for command submission!
nov 06 06:01:02 Odin kernel: [drm:amdgpu_cs_ioctl [amdgpu]] ERROR Not enough memory for command submission!
nov 06 06:01:02 Odin kernel: [drm:amdgpu_cs_ioctl [amdgpu]] ERROR Not enough memory for command submission!
nov 06 06:01:02 Odin kernel: [drm:amdgpu_cs_ioctl [amdgpu]] ERROR Not enough memory for command submission!
nov 06 06:01:02 Odin kernel: [drm:amdgpu_cs_ioctl [amdgpu]] ERROR Not enough memory for command submission!
nov 06 06:01:02 Odin kernel: [drm:amdgpu_cs_ioctl [amdgpu]] ERROR Not enough memory for command submission!
nov 06 06:01:02 Odin gnome-shell[2562]: Page flip failed: drmModeAtomicCommit: Kan ikke tildele hukommelse
nov 06 06:01:02 Odin gnome-shell[2562]: Page flip failed: drmModeAtomicCommit: Kan ikke tildele hukommelse
nov 06 06:01:02 Odin kernel: amdgpu 0000:09:00.0: amdgpu: 0000000064409088 pin failed
nov 06 06:01:02 Odin kernel: [drm:amdgpu_dm_plane_helper_prepare_fb [amdgpu]] ERROR Failed to pin framebuffer with error -12
nov 06 06:01:02 Odin kernel: workqueue: svm_range_restore_work [amdgpu] hogged CPU for >10000us 7 times, consider switching to WQ_UNBOUND
nov 06 06:01:02 Odin xdg-desktop-por[3215]: Error reading events from display: Kanalen blev brudt
nov 06 06:01:02 Odin xdg-desktop-por[3253]: Error reading events from display: Kanalen blev brudt
nov 06 06:01:02 Odin gnome-shell[4791]: (EE) failed to read Wayland events: Broken pipe
nov 06 06:01:02 Odin update-notifier[5080]: Error reading events from display: Kanalen blev brudt
nov 06 06:01:02 Odin gsd-wacom[2791]: Error reading events from display: Kanalen blev brudt
nov 06 06:01:02 Odin gsd-media-keys[2762]: Error reading events from display: Kanalen blev brudt
nov 06 06:01:02 Odin gsd-power[2764]: Error reading events from display: Kanalen blev brudt
nov 06 06:01:02 Odin brave[4519]: Error reading events from display: Kanalen blev brudt
nov 06 06:01:02 Odin gsd-color[2746]: Error reading events from display: Kanalen blev brudt
nov 06 06:01:02 Odin evolution-alarm[5374]: Error reading events from display: Kanalen blev brudt
nov 06 06:01:02 Odin gsd-keyboard[2760]: Error reading events from display: Kanalen blev brudt
nov 06 06:01:02 Odin evolution-alarm[2875]: Error reading events from display: Kanalen blev brudt
nov 06 06:01:02 Odin systemd[2257]: org.gnome.SettingsDaemon.Color.service: Main process exited, code=exited, status=1/FAILURE
nov 06 06:01:02 Odin polkitd[999]: Unregistered Authentication Agent for unix-session:2 (system bus name :1.79, object path /org/freedes>
nov 06 06:01:02 Odin gnome-terminal-[4452]: Error reading events from display: Kanalen blev brudt
nov 06 06:01:02 Odin gnome-software[2857]: Error reading events from display: Kanalen blev brudt
nov 06 06:01:02 Odin solaar[3070]: Error reading events from display: Kanalen blev brudt
nov 06 06:01:02 Odin systemd[2257]: [email protected]: Main process exited, code=killed, status=9/KILL
nov 06 06:01:02 Odin systemd[2257]: org.gnome.SettingsDaemon.Keyboard.service: Main process exited, code=exited, status=1/FAILURE
nov 06 06:01:02 Odin systemd[2257]: org.gnome.SettingsDaemon.MediaKeys.service: Main process exited, code=exited, status=1/FAILURE
nov 06 06:01:02 Odin systemd[2257]: org.gnome.SettingsDaemon.Power.service: Main process exited, code=exited, status=1/FAILURE
nov 06 06:01:02 Odin kernel: amdgpu 0000:09:00.0: amdgpu: VM memory stats for proc xdg-desktop-por(6223) task xdg-deskto:cs0(3215) is no>
nov 06 06:01:02 Odin systemd[2257]: org.gnome.SettingsDaemon.Wacom.service: Main process exited, code=exited, status=1/FAILURE
nov 06 06:01:02 Odin systemd[2257]: xdg-desktop-portal-gnome.service: Main process exited, code=exited, status=1/FAILURE
nov 06 06:01:02 Odin systemd[2257]: xdg-desktop-portal-gnome.service: Failed with result 'exit-code'.
nov 06 06:01:02 Odin systemd[2257]: xdg-desktop-portal-gnome.service: Consumed 1.189s CPU time.
nov 06 06:01:02 Odin systemd[2257]: xdg-desktop-portal-gtk.service: Main process exited, code=exited, status=1/FAILURE
nov 06 06:01:02 Odin systemd[2257]: xdg-desktop-portal-gtk.service: Failed with result 'exit-code'.
nov 06 06:01:02 Odin systemd[2257]: gnome-terminal-server.service: Main process exited, code=exited, status=1/FAILURE
nov 06 06:01:02 Odin systemd[2257]: gnome-terminal-server.service: Failed with result 'exit-code'.
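The journal output above is easier to interpret once the repeated messages are counted. On Linux, error `-12` is `-ENOMEM`, which is the same condition the localized messages report ("Kan ikke tildele hukommelse" is Danish for "Cannot allocate memory", and "Kanalen blev brudt" for "Broken pipe"). The sketch below tallies those message families; the category names and the `tally` helper are hypothetical, and the substrings are copied from the log rather than taken from any amdgpu documentation.

```python
from collections import Counter

# Hypothetical category names mapped to substrings seen in the journal output.
PATTERNS = {
    "cs_ioctl_enomem": "Not enough memory for command submission",
    "framebuffer_pin_failed": "Failed to pin framebuffer with error -12",
    "page_flip_failed": "Page flip failed",
    "display_connection_lost": "Error reading events from display",
}


def tally(lines):
    """Count how many lines fall into each category (first match wins)."""
    counts = Counter()
    for line in lines:
        for name, needle in PATTERNS.items():
            if needle in line:
                counts[name] += 1
                break
    return counts


# A few lines copied from the log, plus one unrelated line.
SAMPLE = [
    "nov 06 06:01:01 Odin kernel: [drm:amdgpu_cs_ioctl [amdgpu]] ERROR Not enough memory for command submission!",
    "nov 06 06:01:01 Odin kernel: [drm:amdgpu_cs_ioctl [amdgpu]] ERROR Not enough memory for command submission!",
    "nov 06 06:01:02 Odin kernel: [drm:amdgpu_dm_plane_helper_prepare_fb [amdgpu]] ERROR Failed to pin framebuffer with error -12",
    "nov 06 06:01:02 Odin gnome-shell[2562]: Page flip failed: drmModeAtomicCommit: Kan ikke tildele hukommelse",
    "nov 06 06:01:02 Odin brave[4519]: Error reading events from display: Kanalen blev brudt",
    "nov 06 06:01:00 Odin systemd[1]: tmp-snap.rootfs_0VjBMz.mount: Deactivated successfully.",
]

for name, count in tally(SAMPLE).items():
    print(f"{name}: {count}")
```

To run it on a real capture, save the journal to a text file and pass its lines to `tally`. Comparing the counts before and after a change gives a rough sense of whether it reduced the memory pressure or only delayed it.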
There's a workaround... I suspect it's a driver issue:
https://github.com/ROCm/TheRock/issues/1795#issuecomment-3519877539
I have now tried the workaround. I can now generate two images, but the third gives this error message:
nov 12 20:07:55 Odin kernel: [drm:amdgpu_cs_ioctl [amdgpu]] ERROR Not enough memory for command submission!
nov 12 20:07:55 Odin kernel: [drm:amdgpu_cs_ioctl [amdgpu]] ERROR Not enough memory for command submission!
nov 12 20:07:55 Odin kernel: [drm:amdgpu_cs_ioctl [amdgpu]] ERROR Not enough memory for command submission!
nov 12 20:07:55 Odin kernel: amdgpu 0000:09:00.0: amdgpu: 00000000ff287153 pin failed
nov 12 20:07:55 Odin kernel: [drm:amdgpu_dm_plane_helper_prepare_fb [amdgpu]] ERROR Failed to pin framebuffer with error -12
nov 12 20:07:55 Odin gnome-shell[2586]: Page flip failed: drmModeAtomicCommit: Kan ikke tildele hukommelse
nov 12 20:07:55 Odin kernel: amdgpu 0000:09:00.0: amdgpu: 00000000cddb2b91 pin failed
nov 12 20:07:55 Odin kernel: [drm:amdgpu_dm_plane_helper_prepare_fb [amdgpu]] ERROR Failed to pin framebuffer with error -12
nov 12 20:07:55 Odin gnome-shell[2586]: Page flip failed: drmModeAtomicCommit: Kan ikke tildele hukommelse
nov 12 20:07:56 Odin gnome-shell[2586]: Page flip failed: drmModeAtomicCommit: Kan ikke tildele hukommelse
nov 12 20:07:56 Odin kernel: amdgpu 0000:09:00.0: amdgpu: 00000000ff287153 pin failed
nov 12 20:07:56 Odin kernel: [drm:amdgpu_dm_plane_helper_prepare_fb [amdgpu]] ERROR Failed to pin framebuffer with error -12
nov 12 20:08:03 Odin kernel: [drm:amdgpu_cs_ioctl [amdgpu]] ERROR Not enough memory for command submission!
nov 12 20:08:03 Odin kernel: [drm:amdgpu_cs_ioctl [amdgpu]] ERROR Not enough memory for command submission!
nov 12 20:08:04 Odin kernel: amdgpu 0000:09:00.0: amdgpu: 00000000f5ac28ec pin failed
nov 12 20:08:04 Odin kernel: [drm:amdgpu_dm_plane_helper_prepare_fb [amdgpu]] ERROR Failed to pin framebuffer with error -12
nov 12 20:08:04 Odin gnome-shell[2586]: Page flip failed: drmModeAtomicCommit: Kan ikke tildele hukommelse
nov 12 20:08:10 Odin gnome-shell[2586]: Page flip failed: drmModeAtomicCommit: Kan ikke tildele hukommelse
nov 12 20:08:10 Odin kernel: amdgpu 0000:09:00.0: amdgpu: 00000000ff287153 pin failed
nov 12 20:08:10 Odin kernel: [drm:amdgpu_dm_plane_helper_prepare_fb [amdgpu]] ERROR Failed to pin framebuffer with error -12
nov 12 20:08:11 Odin kernel: [drm:amdgpu_cs_ioctl [amdgpu]] ERROR Not enough memory for command submission!
nov 12 20:08:11 Odin kernel: [drm:amdgpu_cs_ioctl [amdgpu]] ERROR Not enough memory for command submission!
nov 12 20:08:11 Odin kernel: [drm:amdgpu_cs_ioctl [amdgpu]] ERROR Not enough memory for command submission!
nov 12 20:08:11 Odin kernel: [drm:amdgpu_cs_ioctl [amdgpu]] ERROR Not enough memory for command submission!
nov 12 20:08:11 Odin kernel: [drm:amdgpu_cs_ioctl [amdgpu]] ERROR Not enough memory for command submission!
nov 12 20:08:11 Odin kernel: [drm:amdgpu_cs_ioctl [amdgpu]] ERROR Not enough memory for command submission!
nov 12 20:08:11 Odin kernel: [drm:amdgpu_cs_ioctl [amdgpu]] ERROR Not enough memory for command submission!
nov 12 20:08:11 Odin kernel: [drm:amdgpu_cs_ioctl [amdgpu]] ERROR Not enough memory for command submission!
nov 12 20:08:11 Odin kernel: [drm:amdgpu_cs_ioctl [amdgpu]] ERROR Not enough memory for command submission!
nov 12 20:08:11 Odin kernel: [drm:amdgpu_cs_ioctl [amdgpu]] ERROR Not enough memory for command submission!
nov 12 20:08:11 Odin kernel: [drm:amdgpu_cs_ioctl [amdgpu]] ERROR Not enough memory for command submission!
nov 12 20:08:11 Odin kernel: [drm:amdgpu_cs_ioctl [amdgpu]] ERROR Not enough memory for command submission!
nov 12 20:08:11 Odin kernel: amdgpu 0000:09:00.0: amdgpu: 00000000f5ac28ec pin failed
nov 12 20:08:11 Odin kernel: [drm:amdgpu_dm_plane_helper_prepare_fb [amdgpu]] ERROR Failed to pin framebuffer with error -12
nov 12 20:08:11 Odin gnome-shell[2586]: Page flip failed: drmModeAtomicCommit: Kan ikke tildele hukommelse
nov 12 20:08:13 Odin kernel: [drm:amdgpu_cs_ioctl [amdgpu]] ERROR Not enough memory for command submission!
nov 12 20:09:24 Odin gnome-shell[2586]: GNOME Shell crashed with signal 11
nov 12 20:09:24 Odin gnome-shell[2586]: == Stack trace for context 0x5868327d4830 ==
Correction: after reinstalling Ubuntu from scratch, removing most of my startup flags, and applying the GRUB changes, ComfyUI has become stable, but slower. Still, it's a clear improvement.