Possible to run Wan2.1 on RTX 2060ti?
I have been trying to follow the guides and got to about this point:
https://github.com/modelscope/DiffSynth-Studio/tree/main/examples/wanvideo
I am running into this issue:
torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 47.98 GiB.
I am currently running this script:
import torch
from diffsynth import ModelManager, WanVideoPipeline, save_video, VideoData
from modelscope import snapshot_download
# Download models
snapshot_download("Wan-AI/Wan2.1-T2V-1.3B", local_dir="models/Wan-AI/Wan2.1-T2V-1.3B")
# Load models
model_manager = ModelManager(device="cpu")
model_manager.load_models(
    [
        "models/Wan-AI/Wan2.1-T2V-1.3B/diffusion_pytorch_model.safetensors",
        "models/Wan-AI/Wan2.1-T2V-1.3B/models_t5_umt5-xxl-enc-bf16.pth",
        "models/Wan-AI/Wan2.1-T2V-1.3B/Wan2.1_VAE.pth",
    ],
    torch_dtype=torch.bfloat16,  # You can set `torch_dtype=torch.float8_e4m3fn` to enable FP8 quantization.
)
pipe = WanVideoPipeline.from_model_manager(model_manager, torch_dtype=torch.bfloat16, device="cuda")
pipe.enable_vram_management(num_persistent_param_in_dit=None)
# Text-to-video
video = pipe(
    prompt="bright sunshiny day",
    negative_prompt="dark",
    num_inference_steps=50,
    seed=0, tiled=True
)
save_video(video, "video1.mp4", fps=15, quality=5)
Try setting num_persistent_param_in_dit to 0 instead of None.
Also try setting torch_dtype to torch.float8_e4m3fn.
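For reference, a minimal sketch of what those two changes look like against the script above (same DiffSynth-Studio calls as in the original; only the dtype passed to load_models and the num_persistent_param_in_dit value differ):
model_manager.load_models(
    [
        "models/Wan-AI/Wan2.1-T2V-1.3B/diffusion_pytorch_model.safetensors",
        "models/Wan-AI/Wan2.1-T2V-1.3B/models_t5_umt5-xxl-enc-bf16.pth",
        "models/Wan-AI/Wan2.1-T2V-1.3B/Wan2.1_VAE.pth",
    ],
    torch_dtype=torch.float8_e4m3fn,  # FP8 quantization to shrink the weights in memory
)
pipe = WanVideoPipeline.from_model_manager(model_manager, torch_dtype=torch.bfloat16, device="cuda")
pipe.enable_vram_management(num_persistent_param_in_dit=0)  # keep no DiT parameters resident in VRAM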
Unfortunately, neither fixed the issue. Setting it to 0 instead of None gives the same result, and when I set the torch_dtype the process actually gets killed much earlier during startup.
With 0 instead of None, the error is the same:
torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 47.98 GiB.
With the dtype change (with or without the 0/None change), the log ends with:
Downloading Model to directory: /home/Owner/Wan2.1/DiffSynth-Studio/models/Wan-AI/Wan2.1-T2V-1.3B
2025-03-04 18:59:17,506 - modelscope - INFO - Target directory already exists, skipping creation.
Loading models from: models/Wan-AI/Wan2.1-T2V-1.3B/diffusion_pytorch_model.safetensors
model_name: wan_video_dit model_class: WanModel
This model is initialized with extra kwargs: {'model_type': 't2v', 'patch_size': (1, 2, 2), 'text_len': 512, 'in_dim': 16, 'dim': 1536, 'ffn_dim': 8960, 'freq_dim': 256, 'text_dim': 4096, 'out_dim': 16, 'num_heads': 12, 'num_layers': 30, 'window_size': (-1, -1), 'qk_norm': True, 'cross_attn_norm': True, 'eps': 1e-06}
The following models are loaded: ['wan_video_dit'].
Loading models from: models/Wan-AI/Wan2.1-T2V-1.3B/models_t5_umt5-xxl-enc-bf16.pth
model_name: wan_video_text_encoder model_class: WanTextEncoder
Killed
@centuralcreations You can set both options. If it still shows "Killed", it means your system RAM is not enough.
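As a quick way to confirm whether system RAM (rather than VRAM) is the bottleneck, here is a small hedged sketch; it assumes psutil is installed, which is not part of the original script:
import psutil
import torch

# Report free system RAM and free/total VRAM before loading the models.
print(f"System RAM available: {psutil.virtual_memory().available / 1e9:.1f} GB")
if torch.cuda.is_available():
    free, total = torch.cuda.mem_get_info()  # bytes of free and total memory on the current GPU
    print(f"GPU VRAM free: {free / 1e9:.1f} GB of {total / 1e9:.1f} GB")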