training-free-structured-diffusion-guidance
🤗 Unofficial huggingface/diffusers-based implementation of the paper "Training-Free Structured Diffusion Guidance for Compositional Text-to-Image Synthesis".
I tried to run your code with some experiments, but it reports that the pipeline has been incorrectly initialized or is incorrectly implemented. Expected {'scheduler', 'tokenizer', 'feature_extractor', 'text_encoder', 'vae', 'safety_checker', 'unet'}...
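The "Expected {...}" error above comes from the component check that diffusers pipelines perform at initialization: the modules passed to the pipeline must match the set the pipeline class declares. A minimal sketch of that kind of check, where the `provided` set is a hypothetical example (not taken from the report):

```python
# Illustration of the component-set check, not the actual diffusers source.
# The expected set is the one quoted in the error message above.
expected = {"scheduler", "tokenizer", "feature_extractor",
            "text_encoder", "vae", "safety_checker", "unet"}

# Hypothetical: a pipeline constructed without the safety-related modules.
provided = {"scheduler", "tokenizer", "text_encoder", "vae", "unet"}

missing = expected - provided
if missing:
    # diffusers raises a similar "incorrectly initialized" error in this case.
    print(f"Pipeline is missing components: {sorted(missing)}")
```

A mismatch in either direction (missing or unexpected module names) can trigger this error, so it is worth comparing the pipeline's `__init__` signature against the expected set from the installed diffusers version.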
```
Traceback (most recent call last):
  File "test.py", line 7, in <module>
    image = pipe(prompt, struct_attention="align_seq").images[0]
  File "/opt/anaconda3/envs/ldm/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/data/vjuicefs_hz_cv_enhance_v1/11103392/code/ldm/training-free-structured-diffusion-guidance/tfsdg/pipelines/tfsdg_pipeline.py", line 413, in __call__
    self.unet.to('cuda')...
```
I would advise anyone against using this implementation until these issues are fixed. In the function for sequence alignment (but the same can be said about `_expand_sequence`), we have: def...
Hi, I don't quite understand where Eqs. (4) and (7) of the paper come in in this implementation. Can you point me to them? Thanks
Hi, thank you very much for this. I've started integrating it into https://github.com/hafriedlander/stable-diffusion-grpcserver
A couple of notes:
- I needed to change a couple of calls in StructuredCrossAttention to use...