LightX2V
[Bug] no kernel image is available when running T5
Description
I hit torch.AcceleratorError: CUDA error: no kernel image is available when running the provided script bash wan22/run_wan22_moe_i2v_distill.sh
Steps to Reproduce
- Pull the official docker image
- Setup the code base
- Run bash wan22/run_wan22_moe_i2v_distill.sh after changing the config_json to wan_moe_i2v_distill_4090.json
Expected Result
The model should generate the video
Actual Result
2025-10-31 17:18:53.129 | INFO | lightx2v.utils.utils:load_weights:384 - Loading weights from /workspace/models/wan2.2_i2v/models_t5_umt5-xxl-enc-fp8.pth
2025-10-31 17:18:57.223 | INFO | lightx2v.utils.utils:load_weights:384 - Loading weights from /workspace/models/wan2.2_i2v/Wan2.1_VAE.pth
2025-10-31 17:18:57.536 | INFO | lightx2v.utils.profiler:__exit__:43 - [Profile] Single GPU - Level2_Log Load models cost 24.113897 seconds
2025-10-31 17:19:02.661 | INFO | lightx2v.utils.profiler:__exit__:43 - [Profile] Single GPU - Level1_Log Run VAE Encoder cost 5.112839 seconds
2025-10-31 17:19:02.753 | INFO | lightx2v.utils.profiler:__exit__:43 - [Profile] Single GPU - Level1_Log Run Text Encoder cost 0.091370 seconds
2025-10-31 17:19:02.753 | INFO | lightx2v.utils.profiler:__exit__:43 - [Profile] Single GPU - Level2_Log Run Encoders cost 5.216829 seconds
2025-10-31 17:19:02.753 | INFO | lightx2v.utils.profiler:__exit__:43 - [Profile] Single GPU - Level1_Log RUN pipeline cost 5.216864 seconds
2025-10-31 17:19:02.753 | INFO | lightx2v.utils.profiler:__exit__:43 - [Profile] Single GPU - Level1_Log Total Cost cost 29.331168 seconds
Traceback (most recent call last):
File "<frozen runpy>", line 198, in _run_module_as_main
File "<frozen runpy>", line 88, in _run_code
File "/workspace/lightx2v/infer.py", line 115, in <module>
main()
File "/workspace/lightx2v/infer.py", line 106, in main
runner.run_pipeline(input_info)
File "/workspace/lightx2v/utils/profiler.py", line 77, in sync_wrapper
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/workspace/lightx2v/models/runners/default_runner.py", line 364, in run_pipeline
self.inputs = self.run_input_encoder()
^^^^^^^^^^^^^^^^^^^^^^^^
File "/workspace/lightx2v/utils/profiler.py", line 77, in sync_wrapper
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/workspace/lightx2v/models/runners/default_runner.py", line 205, in _run_input_encoder_local_i2v
text_encoder_output = self.run_text_encoder(self.input_info)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/workspace/lightx2v/utils/profiler.py", line 77, in sync_wrapper
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/workspace/lightx2v/models/runners/wan/wan_runner.py", line 238, in run_text_encoder
context = self.text_encoders[0].infer([prompt])
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/workspace/lightx2v/models/input_encoders/hf/wan/t5/model.py", line 609, in infer
context = self.model(ids, mask)
^^^^^^^^^^^^^^^^^^^^^
File "/opt/conda/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1773, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/conda/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1784, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/workspace/lightx2v/models/input_encoders/hf/wan/t5/model.py", line 351, in forward
x = block(x, mask, pos_bias=e)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/conda/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1773, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/conda/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1784, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/workspace/lightx2v/models/input_encoders/hf/wan/t5/model.py", line 220, in forward
x = fp16_clamp(x + self.attn(self.norm1(x), mask=mask, pos_bias=e))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/conda/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1773, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/conda/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1784, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/workspace/lightx2v/models/input_encoders/hf/wan/t5/model.py", line 125, in forward
attn_bias = x.new_zeros(b, n, q.size(1), k.size(1))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
torch.AcceleratorError: CUDA error: no kernel image is available for execution on the device
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
Environment Information
- Official docker image
- RTX 5090
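This error usually means the PyTorch build inside the docker image was not compiled with a kernel binary (or PTX) for the GPU's compute capability; the RTX 5090 is compute capability 12.0 (sm_120), which older torch builds do not include. A simplified, illustrative sketch of the matching rule (assumed behavior: an sm_XY cubin runs only on a matching device, while a compute_XY PTX entry can be JIT-compiled for any device with capability >= X.Y; real cubin compatibility has per-major nuances this ignores):

```python
def kernel_image_available(device_cc, arch_list):
    """Return True if any entry in arch_list can serve a device
    with compute capability device_cc = (major, minor).

    arch_list entries look like torch.cuda.get_arch_list() output,
    e.g. 'sm_90' (precompiled cubin) or 'compute_90' (PTX for JIT).
    """
    major, minor = device_cc
    for arch in arch_list:
        kind, _, ver = arch.partition("_")
        a_major, a_minor = int(ver[:-1]), int(ver[-1])
        if kind == "sm" and (a_major, a_minor) == (major, minor):
            # Exact binary for this device.
            return True
        if kind == "compute" and (a_major, a_minor) <= (major, minor):
            # PTX can be JIT-compiled forward to newer devices.
            return True
    return False

# An RTX 5090 (12.0) against a build that tops out at sm_90:
print(kernel_image_available((12, 0), ["sm_80", "sm_86", "sm_90"]))  # False
# A build that ships sm_120 works:
print(kernel_image_available((12, 0), ["sm_80", "sm_120"]))          # True
```

Inside the container, `torch.cuda.get_arch_list()` shows which architectures the installed torch actually ships; if sm_120 (or a compute_ entry at or below 12.0) is absent, a torch build with Blackwell support is needed.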