
Attention masks are missing in SD3 to mask out text padding tokens

Open • reminisce opened this issue 8 months ago • 2 comments

Describe the bug

In the SD3 attention implementation, attention masks are currently not used. This results in inconsistent outputs across different values of max_sequence_length whenever the text tokens contain padding, because the attention scores of the padding tokens are non-zero. The problem has been discussed in https://github.com/huggingface/diffusers/discussions/8628; this issue is created to track the progress of fixing it.
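
One way to address this, sketched below, is to build a key mask from the text encoder's attention mask and pass it into the attention call so that padded text tokens receive zero weight. This is only an illustration of the idea, assuming image tokens come first in the joint sequence; the function and argument names are illustrative and not part of the diffusers API.

```python
# Minimal sketch (not the diffusers implementation): fold a padding mask for the
# text tokens into the joint attention. Tensor names and shapes are assumptions.
import torch
import torch.nn.functional as F


def joint_attention_with_padding_mask(query, key, value, text_attention_mask, image_seq_len):
    # query/key/value: (batch, heads, image_seq_len + text_seq_len, head_dim)
    # text_attention_mask: (batch, text_seq_len), 1 for real tokens, 0 for padding
    batch = query.shape[0]

    # Image tokens are always attended to; only padded text tokens are masked out.
    image_mask = torch.ones(batch, image_seq_len, dtype=torch.bool, device=query.device)
    joint_mask = torch.cat([image_mask, text_attention_mask.bool()], dim=1)  # (batch, total_len)

    # Broadcast to (batch, 1, 1, total_len) so every query position ignores padded keys.
    attn_mask = joint_mask[:, None, None, :]

    return F.scaled_dot_product_attention(query, key, value, attn_mask=attn_mask)
```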

Thanks @sayakpaul for the discussion.

Reproduction

n/a
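
For illustration only (not from the original report), a hypothetical way to observe the inconsistency would be to generate the same prompt with the same seed but different max_sequence_length values; the checkpoint name and parameters below are assumptions:

```python
# Hypothetical sketch: because padding tokens currently receive non-zero
# attention, the two images below are expected to differ even though the
# extra tokens are all padding.
import torch
from diffusers import StableDiffusion3Pipeline

pipe = StableDiffusion3Pipeline.from_pretrained(
    "stabilityai/stable-diffusion-3-medium-diffusers", torch_dtype=torch.float16
).to("cuda")

prompt = "a photo of a cat"

image_short = pipe(
    prompt,
    max_sequence_length=77,
    generator=torch.Generator("cuda").manual_seed(0),
).images[0]

image_long = pipe(
    prompt,
    max_sequence_length=256,
    generator=torch.Generator("cuda").manual_seed(0),
).images[0]

# With a proper attention mask, the two images would be expected to match,
# since the prompt fits within 77 tokens and the rest is padding.
```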

Logs

No response

System Info

n/a

Who can help?

No response

reminisce • Jun 24 '24 05:06