Question about Qwen-Image-Edit prompt enhancement

Open liangbingzhao opened this issue 4 months ago • 1 comments

The original Qwen-Image repo says that, before a prompt is sent to the Qwen-Image-Edit model, it should be polished by https://github.com/QwenLM/Qwen-Image/blob/26a3635f485b0a6d85bdb2546b51fc782986fc0a/src/examples/tools/prompt_utils.py#L91

I wonder whether DiffSynth has implemented this; I could not find it. The current implementation has QwenImageUnit_PromptEmbedder in https://github.com/modelscope/DiffSynth-Studio/blob/0a1c172a00fb2dd76abedd3b066ddbf62bd4a60d/diffsynth/pipelines/qwen_image.py#L511C7-L511C35, but I think this is not the prompt enhancement, right?

liangbingzhao, Oct 18 '25 18:10

@liangbingzhao This is an out-of-model processing step. The project aims to leverage the model's inherent capabilities and keep the entire reasoning process transparent. If prompt refinement is needed, please implement it externally.
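
For readers who want to do the refinement externally, here is a minimal sketch of how the two steps could be chained. Nothing in it is DiffSynth or Qwen-Image API: `edit_with_polished_prompt`, `polish`, and `run_edit` are hypothetical names, and the two callables stand in for whatever prompt rewriter and pipeline call you actually use.

```python
from typing import Callable, TypeVar

ImageT = TypeVar("ImageT")


def edit_with_polished_prompt(
    prompt: str,
    image: ImageT,
    polish: Callable[[str, ImageT], str],
    run_edit: Callable[[str, ImageT], ImageT],
) -> ImageT:
    """Rewrite the prompt outside the pipeline, then run the edit.

    `polish` and `run_edit` are hypothetical stand-ins supplied by the
    caller; this helper only fixes the order in which they are applied.
    """
    polished = polish(prompt, image).strip()
    # Fall back to the user's original text if the rewriter returns nothing.
    if not polished:
        polished = prompt
    return run_edit(polished, image)
```

Keeping the rewriter as a plain callable means it can be swapped out or disabled (for example with `lambda p, img: p`) without touching the pipeline, which matches the goal of keeping the in-model path transparent.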

Artiprocher, Oct 20 '25 01:10

If this rewriting works, it seems related to the training method of Qwen-Image-Edit.

Qwen-Image-Edit may have used lengthy prompts during training.

Logically, image editing doesn't need much description; more often, it generates reasonable changes based on a reference image.

Aside from some major changes, minor modifications shouldn't require lengthy prompts.

Many prompts are so vague that humans can't even imagine the scene; it's practically magic.

Deng-Xian-Sheng, Dec 02 '25 09:12

I opened this issue because the performance of Qwen-Image-Edit that I observed did not match the numbers reported on benchmarks, so I wondered whether prompt refinement was the cause. It turned out to be an image resize issue: all images should go through the resize function and become 384*384, which is the key to improving performance. I will close this issue.
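
A rough sketch of the size calculation, for anyone hitting the same mismatch. It rests on two assumptions that are not stated above: that "384*384" is a pixel budget (target area) with the aspect ratio preserved, rather than an exact square, and that each side is snapped to a multiple of 32. `fit_to_area` is a hypothetical helper, not the resize function from either repository, and it only computes dimensions; the actual resampling would be done by your image library.

```python
import math
from typing import Tuple


def fit_to_area(
    width: int,
    height: int,
    target_area: int = 384 * 384,  # assumed meaning of "384*384"
    multiple: int = 32,            # assumed alignment, not from the thread
) -> Tuple[int, int]:
    """Return (new_width, new_height) with roughly `target_area` pixels
    and approximately the same aspect ratio as the input."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    ratio = width / height
    # Solve new_w * new_h = target_area with new_w / new_h = ratio.
    new_w = math.sqrt(target_area * ratio)
    new_h = new_w / ratio
    # Snap each side to the nearest multiple, but never below one multiple.
    snapped_w = max(multiple, round(new_w / multiple) * multiple)
    snapped_h = max(multiple, round(new_h / multiple) * multiple)
    return snapped_w, snapped_h
```

Under these assumptions a 1024x1024 input maps to 384x384 and a 1920x1080 input maps to 512x288. If the model expects an exact 384x384 square instead, a plain fixed-size resize is all that is needed.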

liangbingzhao, Dec 02 '25 10:12