sd-forge-layerdiffuse
subtraction in attention sharing mechanism
In the implementation of attention sharing, I noticed there is a stacked temporal attention adapter. My question is: why is the input `h` subtracted from `modified_hidden_states`? Could you share the rationale behind this design? Thanks!
https://github.com/layerdiffusion/sd-forge-layerdiffuse/blob/e4d5060e05c7b4337a3258bb03c4e3ad2f8b15bb/lib_layerdiffusion/attention_sharing.py#L131-L137
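To make the question concrete, here is a minimal sketch of the pattern I am asking about. This is not the repository's actual code: `TemporalAdapterSketch`, `proj`, and `scale` are hypothetical names, and the sketch assumes one possible reading, namely that subtracting the input isolates the adapter's contribution as a residual delta that can be scaled or gated before being added back.

```python
import torch
import torch.nn as nn


class TemporalAdapterSketch(nn.Module):
    """Illustrative stacked temporal attention adapter whose output is
    expressed as a delta from its input (all names are hypothetical)."""

    def __init__(self, dim: int, heads: int = 8):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.proj = nn.Linear(dim, dim)

    def forward(self, h: torch.Tensor, scale: float = 1.0) -> torch.Tensor:
        # h: (batch, tokens, dim)
        x = self.norm(h)
        attn_out, _ = self.attn(x, x, x)
        modified_hidden_states = h + self.proj(attn_out)
        # The subtraction in question: isolate the adapter's contribution
        # as a pure delta, so the caller can blend it back in controllably.
        delta = modified_hidden_states - h
        return h + scale * delta  # scale = 0 recovers the input exactly
```

Under that reading, the subtraction would let the surrounding code interpolate between the original and the modified hidden states, for example applying the change only where a condition mask is active, instead of always committing to the adapter's full output:

```python
adapter = TemporalAdapterSketch(dim=320)
h = torch.randn(2, 64, 320)
assert torch.allclose(adapter(h, scale=0.0), h)  # zero scale is a no-op
```

Is that the intent here, or is there another reason for the subtraction?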