
Enjoy the magic of Diffusion models!

Results: 380 DiffSynth-Studio issues
Sort by: recently updated

I saw in the Qwen-Image-Edit-2509 report that it supports ControlNet-style conditioning images as input for control. Can I set `extra_inputs` in your LoRA training script to the masks we provide, so that training gives a ControlNet + LoRA effect?

```
accelerate launch examples/qwen_image/model_training/train.py \
  --dataset_base_path data/example_image_dataset \
  --dataset_metadata_path data/example_image_dataset/metadata_qwen_imgae_edit_multi.json \
  --data_file_keys "image,mask_image,edit_image" \
  --extra_inputs "mask_image,edit_image" \
  --max_pixels 1048576 \
  --dataset_repeat 50 \
  --model_id_with_origin_paths "Qwen/Qwen-Image-Edit-2509:transformer/diffusion_pytorch_model*.safetensors,Qwen/Qwen-Image:text_encoder/model*.safetensors,Qwen/Qwen-Image:vae/diffusion_pytorch_model.safetensors" \
  --learning_rate 1e-4 \
  ...
```

I use the following code for single-image inference with Qwen-Image:

```
import glob
from diffsynth.pipelines.qwen_image import QwenImagePipeline, ModelConfig
from PIL import Image
import torch
import os

os.environ["PYTHONBREAKPOINT"] = "0"

pipe = QwenImagePipeline.from_pretrained(
    torch_dtype=torch.bfloat16,
    device="cuda",
    model_configs=[
        ModelConfig(path=[
            '/mnt/dolphinfs/hdd_pool/docker/user/hadoop-automaterials/yongzhao41/insert-anything/checkpoints/Qwen-Image/text_encoder/model-00001-of-00004.safetensors',
            '/mnt/dolphinfs/hdd_pool/docker/user/hadoop-automaterials/yongzhao41/insert-anything/checkpoints/Qwen-Image/text_encoder/model-00002-of-00004.safetensors',
            '/mnt/dolphinfs/hdd_pool/docker/user/hadoop-automaterials/yongzhao41/insert-anything/checkpoints/Qwen-Image/text_encoder/model-00003-of-00004.safetensors',
            '/mnt/dolphinfs/hdd_pool/docker/user/hadoop-automaterials/yongzhao41/insert-anything/checkpoints/Qwen-Image/text_encoder/model-00004-of-00004.safetensors']),
...
```

Hi @Artiprocher, thanks for your awesome project. I am trying to run video-generation inference from control-video and reference-image inputs with [Wan2.1-Fun-1.3B-Control](https://github.com/modelscope/DiffSynth-Studio/blob/main/examples/wanvideo/model_inference/Wan2.1-Fun-1.3B-Control.py). However, I hit this error...

Hi, Krea Realtime is an autoregressive model with causal attention. Does DiffSynth implement causal attention?
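For context on what the question is asking for: causal attention restricts each token to attend only to itself and earlier positions. A minimal NumPy sketch of the mechanism (illustrative only, not DiffSynth's or Krea's implementation; the function name `causal_attention` is made up for this example):

```python
import numpy as np

def causal_attention(q, k, v):
    """Scaled dot-product attention with a causal (lower-triangular) mask.

    Token i may only attend to tokens j <= i, as in autoregressive models.
    q, k, v: arrays of shape (seq_len, dim).
    """
    t, d = q.shape
    scores = q @ k.T / np.sqrt(d)
    # Mask out future positions (strictly above the diagonal).
    future = np.triu(np.ones((t, t), dtype=bool), k=1)
    scores[future] = -np.inf
    # Row-wise softmax; exp(-inf) = 0, so masked positions get zero weight.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v
```

Because position 0 can only attend to itself, its output is exactly `v[0]`; that is the property an autoregressive decoder relies on at generation time.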

Hi, when I use DiffSynth to run inference with Wan2.2 Animate, if the input video shape is not 1280 (h) × 720 (w) I get the error below:

```
File "./DiffSynth-Studio/diffsynth/models/wan_video_animate_adapter.py", line 643,...
```
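As a possible workaround (an assumption on my part, not an official fix), you could resize each input frame to the expected 720 (w) × 1280 (h) before passing the video to the pipeline. A Pillow sketch that scales to cover the target and center-crops, so frames are not distorted (the helper name `fit_frame` is hypothetical):

```python
from PIL import Image

def fit_frame(frame: Image.Image, width: int = 720, height: int = 1280) -> Image.Image:
    """Scale a frame to cover width x height, then center-crop to that size."""
    # Scale factor that makes the frame at least as large as the target in both axes.
    scale = max(width / frame.width, height / frame.height)
    resized = frame.resize(
        (round(frame.width * scale), round(frame.height * scale)),
        Image.LANCZOS,
    )
    # Center-crop to the exact target resolution.
    left = (resized.width - width) // 2
    top = (resized.height - height) // 2
    return resized.crop((left, top, left + width, top + height))
```

Applying this to every frame before inference should give the 1280 × 720 (h × w) shape the adapter apparently expects.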

I hope you find it useful and share the video. You can do LoRA training and full fine-tuning with as little as 6 GB of GPU memory on Windows, with reasonable...