DiffSynth-Studio
Enjoy the magic of Diffusion models!
Are there plans to add the Z-Image model? Could you share an estimated timeline?
Thanks for the great work! I followed the Qwen-Image inference example linked below (recommended for faster inference) to compare FP8 against bfloat16: [./accelerate/Qwen-Image-FP8.py](https://github.com/modelscope/DiffSynth-Studio/blob/833ba1e1fa5d36826d5b21d6628baa17ac1e6f4d/examples/qwen_image/accelerate/Qwen-Image-FP8.py) But for FP8 (torch.float8_e4m3fn), the...
Precalculating the text encoder embeddings can improve VRAM usage, since the text encoders only need to be loaded while the dataset is being preprocessed. The same approach can also apply to the VAE, so...
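A hypothetical sketch of the idea, assuming a generic `text_encoder` module and a list of `prompts` (these names are placeholders, not DiffSynth-Studio APIs): encode every prompt once, cache the embeddings off-GPU, then move the encoder off the device before training starts.

```python
import torch

def precompute_embeddings(text_encoder, prompts, device="cuda"):
    """Encode all prompts once, cache results on CPU, then free the encoder's VRAM."""
    text_encoder.to(device)
    cache = []
    with torch.no_grad():
        for prompt in prompts:
            # Keep cached embeddings on CPU (or serialize to disk for large datasets).
            cache.append(text_encoder(prompt).cpu())
    text_encoder.to("cpu")      # or `del text_encoder` to release it entirely
    torch.cuda.empty_cache()    # return freed blocks to the allocator
    return cache
```

During training, the cached embeddings are fed to the diffusion model directly, so the text encoder never occupies VRAM alongside the denoiser.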
When fine-tuning a LoRA with Qwen-Image-Edit, is there any difference between using a CSV and a JSON dataset file? Or can both be used, as long as the image, prompt, and edit_image fields line up?
On a single machine with 8× 49GB A6000 GPUs, training the LoRA as described in this repo, I have enabled --use_gradient_checkpointing --use_gradient_checkpointing_offload --enable_fp8_training, lowered lora_rank, and tried both ZeRO stage 2 and stage 3, yet every run **runs out of VRAM at the accelerator.prepare step**. I'd appreciate guidance on: 1. Is the DeepSpeed configuration not taking effect? My setup (passed via --config_file) is shown in the image below. 2. Are there other ways to reduce VRAM usage?
Can the training of Z-Image-Turbo be included?
Following the manual, the [Wan-AI/Wan2.1-T2V-1.3B](https://www.modelscope.cn/models/Wan-AI/Wan2.1-T2V-1.3B) model can be quickly loaded and run for inference with the following code:

```python
import torch
from diffsynth import save_video
from diffsynth.pipelines.wan_video_new import WanVideoPipeline, ModelConfig

pipe = WanVideoPipeline.from_pretrained(
    torch_dtype=torch.bfloat16,
    device="cuda",
    model_configs=[
        ModelConfig(model_id="Wan-AI/Wan2.1-T2V-1.3B", origin_file_pattern="diffusion_pytorch_model*.safetensors", offload_device="cpu"),
        ModelConfig(model_id="Wan-AI/Wan2.1-T2V-1.3B", origin_file_pattern="models_t5_umt5-xxl-enc-bf16.pth", offload_device="cpu"),
        ModelConfig(model_id="Wan-AI/Wan2.1-T2V-1.3B", origin_file_pattern="Wan2.1_VAE.pth", offload_device="cpu"),
    ],
    ...
```
Hi Team, thanks for your work, but I am not able to run the 14B model using the steps given. I have 8 GPUs with 24GB VRAM each. Command used: `torchrun...