[SD3 ControlNet] bug in pipeline 'controlnet_pooled_projections'

Open tobiasfshr opened this issue 1 year ago • 4 comments

Describe the bug

Hi,

I think I found an issue that causes a misalignment between training and inference in SD3 ControlNet.

https://github.com/huggingface/diffusers/blob/a3e8d3f7deed140f57a28d82dd0b5d965bd0fb09/src/diffusers/pipelines/controlnet_sd3/pipeline_stable_diffusion_3_controlnet.py#L977

I think the if-else block starting there is not correct. It should be:

        if controlnet_pooled_projections is None and pooled_prompt_embeds is None:
            controlnet_pooled_projections = torch.zeros_like(pooled_prompt_embeds)
        elif controlnet_pooled_projections is None:
            controlnet_pooled_projections = pooled_prompt_embeds

The reason is that, in training, pooled_prompt_embeds is fed to the model: https://github.com/huggingface/diffusers/blob/a3e8d3f7deed140f57a28d82dd0b5d965bd0fb09/examples/controlnet/train_controlnet_sd3.py#L1293
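The proposed fallback can be sketched as a small standalone helper. This is only an illustration, not the pipeline's actual code: `resolve_controlnet_pooled_projections` is a hypothetical name, plain Python lists stand in for the pooled-embedding tensors so the snippet runs without PyTorch, and it assumes pooled_prompt_embeds has already been computed by the time the fallback runs.

```python
from typing import List, Optional

# Plain lists stand in for pooled-embedding tensors (assumption for illustration).
Embedding = List[float]


def resolve_controlnet_pooled_projections(
    controlnet_pooled_projections: Optional[Embedding],
    pooled_prompt_embeds: Embedding,
) -> Embedding:
    """Hypothetical helper: choose the pooled conditioning for the ControlNet."""
    if controlnet_pooled_projections is None:
        # Nothing supplied by the caller: reuse the prompt's pooled embedding,
        # which is what the issue says the training script feeds to the model.
        return pooled_prompt_embeds
    # An explicitly supplied value is passed through unchanged.
    return controlnet_pooled_projections


pooled = [0.25, -1.0, 0.5]
default_choice = resolve_controlnet_pooled_projections(None, pooled)
explicit_choice = resolve_controlnet_pooled_projections([0.0, 0.0, 0.0], pooled)
```

With this default, inference matches training unless the caller opts out, and a caller can still pass an explicit value (for example all zeros) when a particular checkpoint needs it.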

Additionally, I am wondering whether this line: https://github.com/huggingface/diffusers/blob/a3e8d3f7deed140f57a28d82dd0b5d965bd0fb09/examples/controlnet/train_controlnet_sd3.py#L1287 should be aligned with this line: https://github.com/huggingface/diffusers/blob/a3e8d3f7deed140f57a28d82dd0b5d965bd0fb09/examples/controlnet/train_controlnet_sd3.py#L1257

Aligning them seems to be the more sensible approach, but it will probably not make much difference, since the ControlNet can also learn the shift. It might speed up convergence slightly.
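A minimal sketch of why the shift is plausibly learnable, assuming the two linked lines normalize latents as `(x - shift) * scale` and `x * scale` respectively (an assumption about the linked code, not confirmed in this thread). The constants and function names below are made up for illustration; real values would come from the VAE config.

```python
import math

# Illustrative constants only; real values come from the VAE config.
SHIFT = 0.06
SCALE = 1.5


def normalize_with_shift(latent: float) -> float:
    """Shift, then scale (assumed form of the image-latent normalization)."""
    return (latent - SHIFT) * SCALE


def normalize_without_shift(latent: float) -> float:
    """Scale only (assumed form of the control-image normalization)."""
    return latent * SCALE


latents = [-2.0, 0.0, 0.5, 3.0]
offsets = [normalize_without_shift(x) - normalize_with_shift(x) for x in latents]
# Every offset equals SHIFT * SCALE, independent of the latent value.
constant_offset = all(math.isclose(o, SHIFT * SCALE) for o in offsets)
```

Under that assumption the two variants differ only by a constant, which a network with bias terms can absorb; this is consistent with the expectation that aligning the lines would mainly affect convergence speed.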

Best, Tobias

Reproduction

Train an SD3 ControlNet; the affected code path is executed during log_validation.

Logs

No response

System Info

diffusers==0.30.3

Who can help?

@yiyixuxu @sayakpaul

tobiasfshr • Oct 15 '24 16:10

I have tried training an SD3 ControlNet, but the validation results are really bad and the training loss oscillated the whole time. You can take a look at the results in this discussion: https://github.com/huggingface/diffusers/discussions/9675

Do you have any suggestions for getting better results when training an SD3 ControlNet? Thank you!

xduzhangjiayu • Oct 16 '24 01:10

I also found this bug, but when I test the https://huggingface.co/alimama-creative/SD3-Controlnet-Inpainting repo, controlnet_pooled_projections = torch.zeros_like(pooled_prompt_embeds) is correct.

egbertYeah • Oct 18 '24 06:10

> I also found this bug, but when I test the https://huggingface.co/alimama-creative/SD3-Controlnet-Inpainting repo, controlnet_pooled_projections = torch.zeros_like(pooled_prompt_embeds) is correct.

Could you please describe the bug? Maybe I have the same bug as you.

xduzhangjiayu • Oct 18 '24 10:10

> > I also found this bug, but when I test the https://huggingface.co/alimama-creative/SD3-Controlnet-Inpainting repo, controlnet_pooled_projections = torch.zeros_like(pooled_prompt_embeds) is correct.
>
> Could you please describe the bug? Maybe I have the same bug as you.

The controlnet_pooled_projections variable differs between training and inference: training uses pooled_prompt_embeds, while inference uses torch.zeros_like(pooled_prompt_embeds).

egbertYeah • Oct 18 '24 10:10

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

github-actions[bot] • Nov 15 '24 15:11

Cc: @yiyixuxu

sayakpaul • Nov 16 '24 13:11

@sayakpaul @yiyixuxu I can make a PR if you approve of the changes (the if clause is a clear bug, but the missing shift is debatable).

tobiasfshr • Nov 18 '24 08:11

Sorry for the delay on our end, @tobiasfshr. The team was on a company-wide vacation. Yiyi will respond to your queries soon.

sayakpaul • Nov 18 '24 08:11

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

github-actions[bot] • Dec 12 '24 15:12

The code has changed since then. Is this still an issue?

https://github.com/huggingface/diffusers/blob/96c376a5ff201a31d676091a59a011c8c29d095b/src/diffusers/pipelines/controlnet_sd3/pipeline_stable_diffusion_3_controlnet.py#L1027-L1031

https://github.com/huggingface/diffusers/blob/96c376a5ff201a31d676091a59a011c8c29d095b/src/diffusers/models/controlnets/controlnet_sd3.py#L64

hlky • Dec 12 '24 16:12

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

github-actions[bot] • Jan 06 '25 15:01