SVD_Xtend

Stable Video Diffusion Training Code and Extensions.

Results 34 SVD_Xtend issues

Is this svd_xt_image_decoder.safetensors? Why do I get an error after adding the path? I set pretrained_model_name_or_path="/SVD_Xtend-main/svd_xt_image_decoder.safetensors" and then it fails with: raise EnvironmentError(f"It looks like the config file at '{config_file}' is not a valid JSON file.") OSError: It looks like the config file at '/SVD_Xtend-main/svd_xt_image_decoder.safetensors' is not a valid JSON...
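The error in this report is consistent with `pretrained_model_name_or_path` pointing at a single weights file, which the config loader then tries to parse as JSON. As far as I know, diffusers `from_pretrained` expects a model directory (or a Hub repo id) containing `model_index.json` plus per-component subfolders. The sketch below only inspects a local path before training starts; `describe_model_path` is a hypothetical helper, not part of SVD_Xtend or diffusers.

```python
from pathlib import Path


def describe_model_path(path: str) -> str:
    """Classify a value intended for pretrained_model_name_or_path.

    Hypothetical helper for illustration; it only inspects the filesystem.
    """
    p = Path(path)
    if p.is_file():
        # e.g. a lone .safetensors file: not a diffusers-format directory
        return "single-file checkpoint"
    if p.is_dir() and (p / "model_index.json").is_file():
        return "diffusers pipeline directory"
    if p.is_dir():
        return "directory without model_index.json"
    # Nothing on disk: the string may be a Hub repo id instead
    return "not a local path (possibly a Hub repo id)"
```

If the helper reports "single-file checkpoint", pointing the argument at the full downloaded model directory instead is the likely fix.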

When this line runs: model_pred = unet(inp_noisy_latents, timesteps, encoder_hidden_states, added_time_ids=added_time_ids).sample, the error is "RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 6 but got size 5...

Hi, Thanks for any suggestions. The largest resolution that could be used for training is 512 × 512 with ~76G memory cost. I set the enable_xformers_memory_efficient_attention to True but nothing...

Traceback (most recent call last): File "train_svd.py", line 1262, in main() File "train_svd.py", line 1089, in main model_pred = unet( File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl return forward_call(*args, **kwargs) File...

add VideoDummyDataset to load video directly

Thanks for your amazing job! At line 1017 of train_svd.py, the batch size is fixed at 1 (noise_aug_strength = cond_sigmas[0] # TODO: support batch > 1). What should I do to support...

WIP, Not recommended for use, as there are many unresolved issues.

Hi, While exploring diffusion models, I noticed the standard forward pass often uses the formula $\alpha \cdot x + \sigma \cdot \epsilon$. However, in your video diffusion model code, I...
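For reference, the formula quoted in this question can be written with its two common special cases. This is a general sketch of standard notation, not a statement about what the repository's training code does; the subscripts and the labels for the special cases are assumptions added here.

$$
x_t = \alpha_t \, x_0 + \sigma_t \, \epsilon, \qquad \epsilon \sim \mathcal{N}(0, I)
$$

$$
\text{variance-preserving: } \alpha_t^2 + \sigma_t^2 = 1
\qquad\qquad
\text{variance-exploding (EDM-style): } \alpha_t = 1 \;\Rightarrow\; x_t = x_0 + \sigma_t \, \epsilon
$$

In the second case the clean sample is not rescaled, so the noisy input is simply the data plus scaled Gaussian noise.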

![image](https://github.com/pixeli99/SVD_Xtend/assets/23293902/7563b4dd-d1c3-4a1c-9c48-b4c252cd91a8)