sameerKgp
I think there is a bug in the "stablediffusion" function of CrossAttention_Release_NoImages.py. The same latent is used for both the noise_cond and the noise_cond_edit predictions at every step. But these...
I meant that if you replace only the self-attention maps for the first 20 or so steps, and take the cross-attention maps from the edit prompt only, even then it gives...
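To make the schedule I am describing concrete, here is a rough sketch. It is not the repository's code: `choose_attention_map`, its arguments, and the default of 20 steps are all hypothetical names and values I am using only for illustration.

```python
def choose_attention_map(step, kind, source_map, edit_map, self_replace_steps=20):
    """Pick which attention map to use at a given denoising step.

    Hypothetical helper (not from the repository):
    - self-attention: reuse the source-prompt map for the first
      `self_replace_steps` steps, then switch to the edit-prompt map;
    - cross-attention: always use the edit-prompt map.
    """
    if kind not in ("self", "cross"):
        raise ValueError(f"unknown attention kind: {kind!r}")
    if kind == "self" and step < self_replace_steps:
        return source_map
    return edit_map
```

For example, `choose_attention_map(5, "self", src, edit)` returns `src`, while any call with `kind="cross"` returns `edit` regardless of the step.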
Thanks for the reply. I got the cooking_cake video from the link provided in the 15th issue. The GOT video is src/video_fragment/output.mp4.