parryppp
@Z-YuPeng Can we build some relationship between the background and the noised image? For example, during the denoising process, could the model identify the background tokens?
The reshaped attention mask is shown above. Do you mean that, for example, if I want to generate 4 consistent images, the yellow zone in the attention map would not...
Thank you for your explanation, I now understand much more clearly. But I still have a question about the shape of the attention mask. Why does the attention mask ensure that...
> We haven't tried the method used in AVID but we found sliding windows works well. Would you like to share the details of the sliding windows? For example, given a...
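Besides the temporal latents, the env latents don't seem to be integrated either? > [@Artiprocher](https://github.com/Artiprocher) Could you briefly explain the main problems that arise after integrating the temporal latents? Following the implementation in the official Wan2.2 inference code (https://github.com/Wan-Video/Wan2.2/blob/990af50de458c19590c245151197326e208d7191/wan/animate.py#L522), I changed the `y` variable into a concatenation of the reference, temporal, and generated segments, and the inference results show obvious noise. Thanks.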
@zwplus thank you for your reply. I'm a bit confused — after this change, does it still end up with a latent of 81-frame input video matching against a latent...