TeaCache
Is there an example of adding TeaCache to Wan2.1 when running with xDiT USP parallelism?
You can refer to Teacache-xDiT.
I get the following error message on every rank. How should I modify the code to fix it?
[rank1]: Traceback (most recent call last):
[rank1]: File "/sgl-workspace/xuzhezhe/Wan2.1/wan21_teacache.py", line 1462, in <module>
[rank1]: generate(args)
[rank1]: File "/sgl-workspace/xuzhezhe/Wan2.1/wan21_teacache.py", line 1419, in generate
[rank1]: video = wan_i2v.generate(
[rank1]: File "/sgl-workspace/xuzhezhe/Wan2.1/wan21_teacache.py", line 310, in i2v_generate
[rank1]: context = self.text_encoder([input_prompt], self.device)
[rank1]: File "/sgl-workspace/xuzhezhe/Wan2.1/wan/modules/t5.py", line 512, in __call__
[rank1]: context = self.model(ids, mask)
[rank1]: File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
[rank1]: return self._call_impl(*args, **kwargs)
[rank1]: File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1747, in _call_impl
[rank1]: return forward_call(*args, **kwargs)
[rank1]: TypeError: usp_teacache_forward() missing 2 required positional arguments: 'context' and 'seq_len'
[rank6]: Traceback (most recent call last):
[rank6]: File "/sgl-workspace/xuzhezhe/Wan2.1/wan21_teacache.py", line 1462, in <module>
[rank6]: generate(args)
[rank6]: File "/sgl-workspace/xuzhezhe/Wan2.1/wan21_teacache.py", line 1419, in generate
[rank6]: video = wan_i2v.generate(
[rank6]: File "/sgl-workspace/xuzhezhe/Wan2.1/wan21_teacache.py", line 310, in i2v_generate
[rank6]: context = self.text_encoder([input_prompt], self.device)
[rank6]: File "/sgl-workspace/xuzhezhe/Wan2.1/wan/modules/t5.py", line 512, in __call__
[rank6]: context = self.model(ids, mask)
[rank6]: File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
[rank6]: return self._call_impl(*args, **kwargs)
[rank6]: File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1747, in _call_impl
[rank6]: return forward_call(*args, **kwargs)
[rank6]: TypeError: usp_teacache_forward() missing 2 required positional arguments: 'context' and 'seq_len'
[rank5]: Traceback (most recent call last):
[rank5]: File "/sgl-workspace/xuzhezhe/Wan2.1/wan21_teacache.py", line 1462, in <module>
[rank5]: generate(args)
[rank5]: File "/sgl-workspace/xuzhezhe/Wan2.1/wan21_teacache.py", line 1419, in generate
[rank5]: video = wan_i2v.generate(
[rank5]: File "/sgl-workspace/xuzhezhe/Wan2.1/wan21_teacache.py", line 310, in i2v_generate
[rank5]: context = self.text_encoder([input_prompt], self.device)
[rank5]: File "/sgl-workspace/xuzhezhe/Wan2.1/wan/modules/t5.py", line 512, in __call__
[rank5]: context = self.model(ids, mask)
[rank5]: File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
[rank5]: return self._call_impl(*args, **kwargs)
[rank5]: File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1747, in _call_impl
[rank5]: return forward_call(*args, **kwargs)
[rank5]: TypeError: usp_teacache_forward() missing 2 required positional arguments: 'context' and 'seq_len'
[rank3]: Traceback (most recent call last):
[rank3]: File "/sgl-workspace/xuzhezhe/Wan2.1/wan21_teacache.py", line 1462, in <module>
[rank3]: generate(args)
[rank3]: File "/sgl-workspace/xuzhezhe/Wan2.1/wan21_teacache.py", line 1419, in generate
[rank3]: video = wan_i2v.generate(
[rank3]: File "/sgl-workspace/xuzhezhe/Wan2.1/wan21_teacache.py", line 310, in i2v_generate
[rank3]: context = self.text_encoder([input_prompt], self.device)
[rank3]: File "/sgl-workspace/xuzhezhe/Wan2.1/wan/modules/t5.py", line 512, in __call__
[rank3]: context = self.model(ids, mask)
[rank3]: File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
[rank3]: return self._call_impl(*args, **kwargs)
[rank3]: File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1747, in _call_impl
[rank3]: return forward_call(*args, **kwargs)
[rank3]: TypeError: usp_teacache_forward() missing 2 required positional arguments: 'context' and 'seq_len'
[rank4]: Traceback (most recent call last):
[rank4]: File "/sgl-workspace/xuzhezhe/Wan2.1/wan21_teacache.py", line 1462, in <module>
[rank4]: generate(args)
[rank4]: File "/sgl-workspace/xuzhezhe/Wan2.1/wan21_teacache.py", line 1419, in generate
[rank4]: video = wan_i2v.generate(
[rank4]: File "/sgl-workspace/xuzhezhe/Wan2.1/wan21_teacache.py", line 310, in i2v_generate
[rank4]: context = self.text_encoder([input_prompt], self.device)
[rank4]: File "/sgl-workspace/xuzhezhe/Wan2.1/wan/modules/t5.py", line 512, in __call__
[rank4]: context = self.model(ids, mask)
[rank4]: File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
[rank4]: return self._call_impl(*args, **kwargs)
[rank4]: File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1747, in _call_impl
[rank4]: return forward_call(*args, **kwargs)
[rank4]: TypeError: usp_teacache_forward() missing 2 required positional arguments: 'context' and 'seq_len'
[rank7]: Traceback (most recent call last):
[rank7]: File "/sgl-workspace/xuzhezhe/Wan2.1/wan21_teacache.py", line 1462, in <module>
[rank7]: generate(args)
[rank7]: File "/sgl-workspace/xuzhezhe/Wan2.1/wan21_teacache.py", line 1419, in generate
[rank7]: video = wan_i2v.generate(
[rank7]: File "/sgl-workspace/xuzhezhe/Wan2.1/wan21_teacache.py", line 310, in i2v_generate
[rank7]: context = self.text_encoder([input_prompt], self.device)
[rank7]: File "/sgl-workspace/xuzhezhe/Wan2.1/wan/modules/t5.py", line 512, in __call__
[rank7]: context = self.model(ids, mask)
[rank7]: File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
[rank7]: return self._call_impl(*args, **kwargs)
[rank7]: File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1747, in _call_impl
[rank7]: return forward_call(*args, **kwargs)
[rank7]: TypeError: usp_teacache_forward() missing 2 required positional arguments: 'context' and 'seq_len'
[rank0]: Traceback (most recent call last):
[rank0]: File "/sgl-workspace/xuzhezhe/Wan2.1/wan21_teacache.py", line 1462, in <module>
[rank0]: generate(args)
[rank0]: File "/sgl-workspace/xuzhezhe/Wan2.1/wan21_teacache.py", line 1419, in generate
[rank0]: video = wan_i2v.generate(
[rank0]: File "/sgl-workspace/xuzhezhe/Wan2.1/wan21_teacache.py", line 310, in i2v_generate
[rank0]: context = self.text_encoder([input_prompt], self.device)
[rank0]: File "/sgl-workspace/xuzhezhe/Wan2.1/wan/modules/t5.py", line 512, in __call__
[rank0]: context = self.model(ids, mask)
[rank0]: File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
[rank0]: return self._call_impl(*args, **kwargs)
[rank0]: File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1747, in _call_impl
[rank0]: return forward_call(*args, **kwargs)
[rank0]: TypeError: usp_teacache_forward() missing 2 required positional arguments: 'context' and 'seq_len'
[rank2]: Traceback (most recent call last):
[rank2]: File "/sgl-workspace/xuzhezhe/Wan2.1/wan21_teacache.py", line 1462, in <module>
[rank2]: generate(args)
[rank2]: File "/sgl-workspace/xuzhezhe/Wan2.1/wan21_teacache.py", line 1419, in generate
[rank2]: video = wan_i2v.generate(
[rank2]: File "/sgl-workspace/xuzhezhe/Wan2.1/wan21_teacache.py", line 310, in i2v_generate
[rank2]: context = self.text_encoder([input_prompt], self.device)
[rank2]: File "/sgl-workspace/xuzhezhe/Wan2.1/wan/modules/t5.py", line 512, in __call__
[rank2]: context = self.model(ids, mask)
[rank2]: File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
[rank2]: return self._call_impl(*args, **kwargs)
[rank2]: File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1747, in _call_impl
[rank2]: return forward_call(*args, **kwargs)
[rank2]: TypeError: usp_teacache_forward() missing 2 required positional arguments: 'context' and 'seq_len'
[rank0]:[W429 04:40:23.236914761 ProcessGroupNCCL.cpp:1250] Warning: WARNING: process group has NOT been destroyed before we destruct ProcessGroupNCCL. On normal program exit, the application should call destroy_process_group to ensure that any pending NCCL operations have finished in this process. In rare cases this process can exit before this point and block the progress of another member of the process group. This constraint has always been present, but this warning has only been added since PyTorch 2.4 (function operator())
W0429 04:40:24.955000 30340 torch/distributed/elastic/multiprocessing/api.py:897] Sending process 30405 closing signal SIGTERM
W0429 04:40:24.955000 30340 torch/distributed/elastic/multiprocessing/api.py:897] Sending process 30406 closing signal SIGTERM
W0429 04:40:24.956000 30340 torch/distributed/elastic/multiprocessing/api.py:897] Sending process 30408 closing signal SIGTERM
W0429 04:40:24.957000 30340 torch/distributed/elastic/multiprocessing/api.py:897] Sending process 30409 closing signal SIGTERM
W0429 04:40:24.957000 30340 torch/distributed/elastic/multiprocessing/api.py:897] Sending process 30410 closing signal SIGTERM
W0429 04:40:24.957000 30340 torch/distributed/elastic/multiprocessing/api.py:897] Sending process 30411 closing signal SIGTERM
W0429 04:40:24.958000 30340 torch/distributed/elastic/multiprocessing/api.py:897] Sending process 30412 closing signal SIGTERM
E0429 04:40:25.787000 30340 torch/distributed/elastic/multiprocessing/api.py:869] failed (exitcode: 1) local_rank: 2 (pid: 30407) of binary: /usr/bin/python3
Traceback (most recent call last):
File "/usr/local/bin/torchrun", line 8, in <module>
sys.exit(main())
File "/usr/local/lib/python3.10/dist-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 355, in wrapper
return f(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/torch/distributed/run.py", line 919, in main
run(args)
File "/usr/local/lib/python3.10/dist-packages/torch/distributed/run.py", line 910, in run
elastic_launch(
File "/usr/local/lib/python3.10/dist-packages/torch/distributed/launcher/api.py", line 138, in __call__
return launch_agent(self._config, self._entrypoint, list(args))
File "/usr/local/lib/python3.10/dist-packages/torch/distributed/launcher/api.py", line 269, in launch_agent
raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
wan21_teacache.py FAILED
------------------------------------------------------------
Failures:
<NO_OTHER_FAILURES>
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
time : 2025-04-29_04:40:24
host : 258b0e4d5b77
rank : 2 (local_rank: 2)
exitcode : 1 (pid: 30407)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
I've encountered the same problem. Have you solved it?
Same question here. How did you solve it?
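One possible reading of the traceback, offered as a guess rather than a confirmed diagnosis: the failing call is `self.model(ids, mask)` inside `wan/modules/t5.py`, yet the function that ends up running is `usp_teacache_forward`, which expects `context` and `seq_len`. That pattern would appear if the TeaCache forward were bound to the T5 text encoder (or to something it shares) instead of only to the diffusion transformer. The sketch below reproduces the same `TypeError` without PyTorch and shows a guarded patch. `Module`, `T5Encoder`, `WanModel` and `apply_teacache` are hypothetical stand-ins written for this illustration, not the real Wan2.1 or xDiT classes, and the body of `usp_teacache_forward` is a placeholder.

```python
import types


class Module:
    """Hypothetical stand-in for torch.nn.Module: calling it runs forward()."""

    def __call__(self, *args, **kwargs):
        return self.forward(*args, **kwargs)


class T5Encoder(Module):
    """Hypothetical stand-in for the text encoder, called as model(ids, mask)."""

    def forward(self, ids, mask):
        return ("t5", ids, mask)


class WanModel(Module):
    """Hypothetical stand-in for the diffusion transformer."""

    def forward(self, x, t, context, seq_len):
        return ("wan", x, t, context, seq_len)


def usp_teacache_forward(self, x, t, context, seq_len):
    # Placeholder body; the real function would hold the TeaCache + USP logic.
    return ("teacache", x, t, context, seq_len)


# Reproduce the error: the patched forward is bound to the wrong object.
encoder = T5Encoder()
encoder.forward = types.MethodType(usp_teacache_forward, encoder)
try:
    encoder([1, 2], [1, 1])  # ids and mask land in x and t
    error_message = ""
except TypeError as exc:
    error_message = str(exc)


def apply_teacache(model):
    """Bind the TeaCache forward to one WanModel instance and nothing else."""
    if not isinstance(model, WanModel):
        raise TypeError(f"refusing to patch {type(model).__name__}")
    model.forward = types.MethodType(usp_teacache_forward, model)
    return model


patched = apply_teacache(WanModel())
clean_encoder = T5Encoder()  # left untouched, still accepts (ids, mask)
```

If this guess applies to your script, the place to check is wherever `usp_teacache_forward` gets assigned: it should target only the diffusion model instance (or its class), and the text encoder should keep its original `forward`.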