DiffSynth-Studio/Qwen-Image-EliGen-V2
Hi, could you clarify the required format for --dataset_metadata_path and how it corresponds to --data_file_keys? Should the JSON file be a list of objects with keys such as "image" and "eligen_entity_masks" that directly match the values passed to those arguments?
--dataset_metadata_path data/example_image_dataset/metadata_eligen.json \
--data_file_keys "image,eligen_entity_masks" \
All example datasets for DiffSynth-Studio are placed in DiffSynth-Studio/example_image_dataset. The example for EliGen is:
[
    {
        "image": "eligen/image.png",
        "prompt": "A beautiful girl wearing shirt and shorts in the street, holding a sign 'Entity Control'",
        "eligen_entity_prompts": [
            "A beautiful girl",
            "sign 'Entity Control'",
            "shorts",
            "shirt"
        ],
        "eligen_entity_masks": [
            "eligen/0.png",
            "eligen/1.png",
            "eligen/2.png",
            "eligen/3.png"
        ]
    }
]
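For reference, here is a minimal sketch of how the keys listed in --data_file_keys relate to the metadata: they name the fields whose values are file paths to load, while the remaining fields (prompt, eligen_entity_prompts) stay as plain text. This is not DiffSynth's actual dataset code; the load_sample helper and the assumption that paths are relative to the dataset root are only illustrative.

import json
from pathlib import Path

from PIL import Image

# Illustrative paths mirroring the example above; not DiffSynth's loader itself.
DATASET_ROOT = Path("data/example_image_dataset")
METADATA_PATH = DATASET_ROOT / "metadata_eligen.json"

# The value passed via --data_file_keys: fields whose values are file paths.
DATA_FILE_KEYS = "image,eligen_entity_masks".split(",")

def load_sample(record: dict) -> dict:
    """Resolve file-path fields to PIL images; leave text fields untouched."""
    sample = {}
    for key, value in record.items():
        if key in DATA_FILE_KEYS:
            if isinstance(value, list):  # e.g. eligen_entity_masks
                sample[key] = [Image.open(DATASET_ROOT / p) for p in value]
            else:                        # e.g. image
                sample[key] = Image.open(DATASET_ROOT / value)
        else:                            # prompt, eligen_entity_prompts stay as text
            sample[key] = value
    return sample

with open(METADATA_PATH) as f:
    metadata = json.load(f)  # a JSON list of records like the one above

first = load_sample(metadata[0])
print(type(first["image"]), len(first["eligen_entity_masks"]))

Note that the order of eligen_entity_masks must match the order of eligen_entity_prompts, so each mask pairs with its entity prompt.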
Hi, thanks for the clarification on the dataset format in the previous discussion. I now have another question: I would like to modify the Regional Attention mechanism in Qwen-Image-EliGen-V2, and I plan to implement this within the DiffSynth framework. Could you please advise which part/module of DiffSynth would be the most appropriate place to make these changes, or suggest a recommended approach?
Please refer to the following:
Preprocess: https://github.com/modelscope/DiffSynth-Studio/blob/0d6de58af9269654c3d4ef30de5a12ad1527c826/diffsynth/pipelines/qwen_image.py#L594
Attention Mask Construction: https://github.com/modelscope/DiffSynth-Studio/blob/0d6de58af9269654c3d4ef30de5a12ad1527c826/diffsynth/models/qwen_image_dit.py#L434
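For orientation only, the sketch below illustrates the general idea behind a regional attention mask: each entity prompt's text tokens are restricted to attend to the image tokens covered by that entity's mask. It is not the code at the links above, and build_regional_attention_mask is a hypothetical name used here for illustration.

import torch
import torch.nn.functional as F

def build_regional_attention_mask(
    entity_masks: torch.Tensor,   # (num_entities, H, W) binary masks at image resolution
    entity_token_lens: list[int], # number of text tokens per entity prompt
    latent_hw: tuple[int, int],   # (h, w) of the image token grid inside the DiT
) -> torch.Tensor:
    """Return a (num_text_tokens, num_image_tokens) boolean mask where True
    means "this entity's text token may attend to this image token"."""
    h, w = latent_hw
    # Downsample each spatial mask to the token grid and flatten it.
    masks = F.interpolate(entity_masks[:, None].float(), size=(h, w), mode="nearest")
    masks = masks.flatten(2).squeeze(1) > 0.5             # (num_entities, h*w)

    rows = []
    for entity_mask, n_tokens in zip(masks, entity_token_lens):
        # Every token of this entity prompt shares the same spatial region.
        rows.append(entity_mask[None].expand(n_tokens, -1))
    return torch.cat(rows, dim=0)                          # (sum(token_lens), h*w)

# Toy usage: 2 entities, 32x32 masks, 4x4 token grid.
masks = torch.zeros(2, 32, 32)
masks[0, :16] = 1  # entity 0 occupies the top half
masks[1, 16:] = 1  # entity 1 occupies the bottom half
attn_mask = build_regional_attention_mask(masks, entity_token_lens=[3, 5], latent_hw=(4, 4))
print(attn_mask.shape)  # torch.Size([8, 16])

If you want to change how the regions interact (for example, whether entity tokens also attend to the global prompt or to image tokens outside their mask), the attention-mask construction function linked above is the place where that policy is expressed.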
Thank you so much! I have one more small question: I'd like to add an auxiliary loss. Could you tell me where in the code I should make the change, or how to integrate it with EliGen's original loss implementation?
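For example, conceptually I mean something along these lines (hypothetical names and tensors, not DiffSynth's actual training loop; the auxiliary term here is purely illustrative):

import torch
import torch.nn.functional as F

# Hypothetical weight for the extra term; not an existing DiffSynth argument.
aux_loss_weight = 0.1

def compute_loss(model_pred, target, entity_attn_maps, entity_masks):
    # Base objective: MSE between the DiT prediction and the training target.
    base_loss = F.mse_loss(model_pred, target)

    # Illustrative auxiliary term: encourage per-entity attention maps to
    # agree with the ground-truth entity masks.
    aux_loss = F.mse_loss(entity_attn_maps, entity_masks)

    return base_loss + aux_loss_weight * aux_loss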