DiffSynth-Studio: question about data length in Wan-animated inference and training

Why is the length of conds equal to the generated video length minus 4 in both the Wan-animated inference code and the training code? https://github.com/modelscope/DiffSynth-Studio/blob/0a1c172a00fb2dd76abedd3b066ddbf62bd4a60d/examples/wanvideo/model_training/validate_lora/Wan2.2-Animate-14B.py#L21 https://github.com/modelscope/DiffSynth-Studio/blob/0a1c172a00fb2dd76abedd3b066ddbf62bd4a60d/diffsynth/pipelines/wan_video_new.py#L1072
Also, is it correct that neither training nor inference for Wan-animated implements the overlap strategy described in the paper?

Oct 17 '25 07:10 Guan-chen-lu

@Guan-chen-lu

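This is determined by the model's structure.
No, it is not implemented. It would cause a series of knock-on problems in the framework, so for now we will not integrate this logic into the framework.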
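
One possible reading of "determined by the model's structure", sketched below, is frame-count arithmetic. It is not confirmed anywhere in this thread. It assumes the video VAE compresses time by a factor of 4 and maps T frames to (T - 1) / 4 + 1 latent frames, and that the reference image takes up one latent frame ahead of the conds. The helper `latent_frames` and the 81-frame example are hypothetical and are not taken from the DiffSynth-Studio code.

```python
def latent_frames(num_frames: int, temporal_stride: int = 4) -> int:
    """Latent frame count for a video VAE with the assumed temporal stride.

    Assumption: T input frames map to (T - 1) // stride + 1 latent frames.
    """
    if num_frames < 1 or (num_frames - 1) % temporal_stride != 0:
        raise ValueError("num_frames must have the form stride * k + 1")
    return (num_frames - 1) // temporal_stride + 1


video_frames = 81                 # hypothetical length of the generated video
cond_frames = video_frames - 4    # length of conds, as asked about above

total_latents = latent_frames(video_frames)
cond_latents = latent_frames(cond_frames)

# Under these assumptions, exactly one latent frame is left over,
# which could be the slot taken by the reference image.
reference_latents = total_latents - cond_latents
print(total_latents, cond_latents, reference_latents)
```

If this reading holds, dropping 4 frames from conds frees exactly one latent frame, which would explain why the same offset appears in both the inference and the training code.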

Oct 20 '25 03:10 Artiprocher

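@Artiprocher Could you briefly explain the main problems that come up when integrating temporal latents? Following the implementation in the official Wan2.2 inference code (https://github.com/Wan-Video/Wan2.2/blob/990af50de458c19590c245151197326e208d7191/wan/animate.py#L522), I changed the y variable into a concatenation of the reference, temporal, and generated segments, and the inference results show obvious noise. Thanks.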

Nov 07 '25 12:11 Guan-chen-lu

Besides temporal latents, it seems env latents have not been integrated either?

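@Artiprocher Could you briefly explain the main problems that come up when integrating temporal latents? Following the implementation in the official Wan2.2 inference code (https://github.com/Wan-Video/Wan2.2/blob/990af50de458c19590c245151197326e208d7191/wan/animate.py#L522), I changed the y variable into a concatenation of the reference, temporal, and generated segments, and the inference results show obvious noise. Thanks.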

Nov 13 '25 02:11 parryppp