Question about Qwen-Image-Edit prompt enhancement
The original Qwen-Image repo says that before a prompt is sent to the Qwen-Image-Edit model, it should be polished by https://github.com/QwenLM/Qwen-Image/blob/26a3635f485b0a6d85bdb2546b51fc782986fc0a/src/examples/tools/prompt_utils.py#L91
I wonder whether DiffSynth-Studio has implemented this; I could not find it. The current implementation has `QwenImageUnit_PromptEmbedder` in https://github.com/modelscope/DiffSynth-Studio/blob/0a1c172a00fb2dd76abedd3b066ddbf62bd4a60d/diffsynth/pipelines/qwen_image.py#L511C7-L511C35, but I believe that handles prompt embedding rather than prompt enhancement, right?
@liangbingzhao This is an out-of-model processing step. The project aims to leverage the model's inherent capabilities and keep the entire reasoning process transparent. If prompt refinement is needed, please implement it externally.
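An external refinement step along these lines can be sketched as a thin wrapper that rewrites the raw instruction before it reaches the pipeline. This is only a minimal illustration, assuming a template-based rewrite; the actual template and rewriter in Qwen-Image's `prompt_utils.py` differ (it calls an LLM), and the function names here are hypothetical:

```python
# Hypothetical sketch of an out-of-pipeline prompt-refinement step.
# The template text is an assumption for illustration, not the exact
# one used in QwenLM/Qwen-Image's prompt_utils.py (which queries an LLM).
EDIT_REWRITE_TEMPLATE = (
    "Rewrite the following image-editing instruction into a clear, "
    "detailed prompt while preserving the user's intent:\n{instruction}"
)

def build_rewrite_request(instruction: str) -> str:
    """Build the text one would send to a rewriting LLM (e.g. Qwen)."""
    return EDIT_REWRITE_TEMPLATE.format(instruction=instruction.strip())

def refine_prompt(instruction: str, llm_call=None) -> str:
    """Return a polished prompt; fall back to the raw instruction if no LLM is given."""
    if llm_call is None:
        return instruction  # transparent pass-through, as the project prefers
    return llm_call(build_rewrite_request(instruction))

polished = refine_prompt("make the sky purple")
print(polished)
```

The polished string would then be passed to the DiffSynth pipeline in place of the raw user instruction.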
If this rewriting works, it seems related to the training method of Qwen-Image-Edit.
Qwen-Image-Edit may have used lengthy prompts during training.
Logically, image editing doesn't need much description; more often, it generates reasonable changes based on a reference image.
Aside from some major changes, minor modifications shouldn't require lengthy prompts.
Many prompts are so vague that humans can't even imagine the scene; it's practically magic.
The reason I opened this issue is that I found the performance of Qwen-Image-Edit did not match the reported benchmark results, so I wondered whether prompt refinement was the cause. It turns out to be an image-resize issue: all input images should go through the resize function and become 384×384, which is the key to matching the reported performance. I will close this issue.
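For reference, the preprocessing described above can be sketched as a fixed resize applied to every input image before it enters the pipeline. This is an illustrative snippet using Pillow, with the 384×384 target taken from the comment above; the function name and resampling filter are my own choices, not DiffSynth's exact implementation:

```python
from PIL import Image

# Target resolution reported above as the key to benchmark performance.
EDIT_SIZE = (384, 384)

def preprocess_for_edit(image: Image.Image) -> Image.Image:
    """Resize any input image to the fixed edit resolution (illustrative)."""
    return image.convert("RGB").resize(EDIT_SIZE, Image.LANCZOS)

# Example: an arbitrarily sized image is normalized to 384x384.
demo = Image.new("RGB", (1024, 768), "white")
resized = preprocess_for_edit(demo)
print(resized.size)  # (384, 384)
```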