[Bug] size mismatch for FastWan2.2-TI2V-5B-FullAttn-Diffusers

Open Lifedecoder opened this issue 1 month ago • 1 comment

Describe the bug

File "/users/miniconda3/envs/fastvideo/lib/python3.12/site-packages/fastvideo/models/utils.py", line 159, in pred_noise_to_pred_video
(FVWorkerProc-0 pid=3599678) ERROR 11-04 10:05:23 [gpu_worker.py:226]     pred_video = noise_input_latent - sigma_t * pred_noise
(FVWorkerProc-0 pid=3599678) ERROR 11-04 10:05:23 [gpu_worker.py:226]                  ~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~
(FVWorkerProc-0 pid=3599678) ERROR 11-04 10:05:23 [gpu_worker.py:226] RuntimeError: The size of tensor a (53) must match the size of tensor b (52) at non-singleton dimension 3
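A possible reading of the 53 vs 52 mismatch, offered as a guess rather than a confirmed diagnosis: 53 equals 848 / 16, so the requested width may produce an odd-sized latent that loses one column when the transformer splits it into patches. The sketch below only does that arithmetic. The factors `VAE_SPATIAL_FACTOR` and `PATCH_SIZE` and both helper functions are assumptions made for illustration; they are not read from FastVideo or from the model config.

```python
# Hypothetical shape arithmetic. Both factors are assumptions, not values
# taken from FastVideo or the FastWan2.2 model config.
VAE_SPATIAL_FACTOR = 16  # assumed pixel-to-latent downsampling per spatial axis
PATCH_SIZE = 2           # assumed transformer patch size per spatial axis


def latent_and_predicted_size(pixels: int) -> tuple[int, int]:
    """Return (latent size, size after patchify/unpatchify) for one spatial axis."""
    latent = pixels // VAE_SPATIAL_FACTOR
    predicted = (latent // PATCH_SIZE) * PATCH_SIZE  # an odd latent loses one column
    return latent, predicted


def nearest_safe_sizes(pixels: int) -> tuple[int, int]:
    """Return the closest multiples of VAE_SPATIAL_FACTOR * PATCH_SIZE below and above."""
    step = VAE_SPATIAL_FACTOR * PATCH_SIZE
    lower = (pixels // step) * step
    upper = lower if lower == pixels else lower + step
    return lower, upper


print(latent_and_predicted_size(848))  # (53, 52)
print(latent_and_predicted_size(480))  # (30, 30)
print(nearest_safe_sizes(848))         # (832, 864)
```

If these assumptions hold, width=848 would give the 53 and 52 seen in the error, height=480 would be unaffected, and a width of 832 or 864 would avoid the mismatch. This has not been verified against the library.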

Reproduction

import os

from fastvideo import VideoGenerator


def main():
    os.environ["FASTVIDEO_ATTENTION_BACKEND"] = "VIDEO_SPARSE_ATTN"

    # Create a video generator with a pre-trained model
    generator = VideoGenerator.from_pretrained(
        "FastWan2.2-TI2V-5B-FullAttn-Diffusers",
        num_gpus=1,  # Adjust based on your hardware
    )

    # Define a prompt for your video
    prompt = "A curious raccoon peers through a vibrant field of yellow sunflowers, its eyes wide with interest."

    # Generate the video
    video = generator.generate_video(
        prompt,
        return_frames=True,  # Also return frames from this call (defaults to False)
        output_path="output_t2v_pri/",  # Controls where videos are saved
        save_video=True,
        height=480,
        width=848,
    )


if __name__ == "__main__":
    main()

Environment

fastvideo 1.6.0

Nov 04 '25 10:11 Lifedecoder

Also, does FastWan2.2-TI2V-5B-FullAttn-Diffusers actually support only T2V, and not TI2V?

Nov 04 '25 10:11 Lifedecoder