DiffSynth-Studio issues

DiffSynth-Studio/Qwen-Image-EliGen-V2

4

Hi, could you clarify the required format for --dataset_metadata_path and how it matches with --data_file_keys? Should the JSON file be a list of objects with keys like "image" and "eligen_entity_masks"...

Amark-cheey

[Feature] Ascend NPU native support for wanvideo

EndersOwner

Qwen-Image-Edit-2509

2

examples/qwen_image/model_inference/Qwen-Image-Edit-2509.py

from diffsynth.pipelines.qwen_image import QwenImagePipeline, ModelConfig
from PIL import Image
import torch

pipe = QwenImagePipeline.from_pretrained(
    torch_dtype=torch.bfloat16,
    device="cuda",
    model_configs=[
        ModelConfig(model_id="Qwen/Qwen-Image-Edit-2509", origin_file_pattern="transformer/diffusion_pytorch_model*.safetensors"),
        ModelConfig(model_id="Qwen/Qwen-Image", origin_file_pattern="text_encoder/model*.safetensors"),
        ModelConfig(model_id="Qwen/Qwen-Image", origin_file_pattern="vae/diffusion_pytorch_model.safetensors"),
    ],
    processor_config=ModelConfig(model_id="Qwen/Qwen-Image-Edit", origin_file_pattern="processor/"),
)
image_1 =...

Amark-cheey

Error: AttributeError: 'list' object has no attribute 'size' in example code of Qwen-Image-Edit

2

Key code:
edit_image = [Image.open("image1.jpg"), Image.open("image2.jpg")]
image_3 = pipe(prompt, edit_image=edit_image, seed=1, num_inference_steps=40, height=1328, width=1024, edit_image_auto_resize=True)
Comment: It looks like a list is not accepted for the 'edit_image' argument of pipe.
Error details:...

charleybin

why drop last 4 frames in wan-animate training?

3

https://github.com/modelscope/DiffSynth-Studio/blob/main/diffsynth/pipelines/wan_video_new.py#L1073

parryppp

Question about this repo's features

What is the relationship between this system and diffusers? What are its advantages over training directly with a diffusers pipeline?

yonghenglh6

Error during training: the size of tensor a (4) must match the size of tensor b (5) at non-singleton dimension 2

num_frames % 4 != 1. We round it up to 17. num_frames % 4 != 1. We round it up to 17. num_frames % 4 != 1. We round it...
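The repeated warning above suggests that the pipeline expects `num_frames % 4 == 1` and rounds other values up. A minimal sketch of that rounding rule follows; `round_up_num_frames` is a hypothetical helper written for illustration, and the library's own rounding may differ in detail.

```python
def round_up_num_frames(num_frames: int) -> int:
    """Return the smallest n >= num_frames such that n % 4 == 1.

    Hypothetical helper: it mirrors the behaviour implied by the warning
    text, not the actual DiffSynth-Studio implementation.
    """
    if num_frames < 1:
        raise ValueError("num_frames must be a positive integer")
    # Integer form of ceil((num_frames - 1) / 4) * 4 + 1.
    return ((num_frames - 1 + 3) // 4) * 4 + 1


for requested in (14, 16, 17, 18):
    print(requested, "->", round_up_num_frames(requested))
```

Choosing a frame count that already satisfies the rule (13, 17, 21, ...) when preparing the dataset would avoid the warning; whether that also resolves the tensor size mismatch reported here is not confirmed by the issue text.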

sugelamyd123

TypeError: WanS2VModel.patchify() takes 2 positional arguments but 3 were given

```
pipe(
    prompt=prompt,
    input_image=input_image,
    negative_prompt=negative_prompt,
    seed=0,
    num_frames=num_frames,
    height=height,
    width=width,
    audio_sample_rate=sample_rate,
    input_audio=input_audio,
    num_inference_steps=40,
    sliding_window_size=48,    # removing these two arguments avoids the error
    sliding_window_stride=24,  #
)
```
When running inference with the s2v model, adding the sliding_window_size and sliding_window_stride parameters raises an error: ``` Traceback (most recent...
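For readers unfamiliar with this message: Python counts `self` as the first positional argument, so "takes 2 positional arguments but 3 were given" means a method defined with one parameter besides `self` was called with two. The sketch below reproduces the same message with a toy class; `ToyModel` is a hypothetical stand-in, and whether the sliding-window code path really passes an extra argument to `WanS2VModel.patchify()` is an assumption that the truncated traceback does not confirm.

```python
class ToyModel:
    """Hypothetical stand-in for illustration; not the real WanS2VModel."""

    def patchify(self, x):
        # Accepts exactly one argument besides `self`.
        return x


def call_with_extra_argument():
    """Call patchify with one argument too many and return the error text."""
    model = ToyModel()
    try:
        model.patchify([1, 2, 3], "extra")
    except TypeError as exc:
        return str(exc)
    return None


message = call_with_extra_argument()
print(message)
```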

yqxd

How to run multi-GPU inference with wan2.2-14b?

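With two H20 GPUs, inference runs on a single card, but launching it with `accelerate launch --multi_gpu --num_processes 2 examples/wanvideo/model_inference/Wan2.2-I2V-A14B.py` goes OOM, and `nvidia-smi` never shows any usage on the second card. Is this the correct way to launch multi-GPU inference?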

ruolinsss

wan2.2-vace model goes OOM in the decoder stage

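Other wan2.2-related models all run inference without problems on the H20, but the vace model goes OOM after finishing the 50 diffusion steps. Reducing the resolution and the frame number did not help.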

ruolinsss
