style_aligned_comfy
The workflow result is not in the same style
I used your style_aligned_inversion.json as my workflow, but it does not work.
Same here!
Can you post your workflow as is? Will look into this. I would suggest removing your negative prompt and reducing CFG scale - the text guidance is probably overriding the effect of the node.
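To illustrate why a high CFG scale can drown out the node, here is a rough sketch assuming the standard classifier-free guidance formula. The helper name cfg_combine and the toy numbers are made up for illustration and are not taken from ComfyUI. In most UIs the negative prompt takes the place of the unconditional branch, which is why removing it can also help.

```python
def cfg_combine(uncond, cond, scale):
    # Classifier-free guidance: start from the unconditional (or negative
    # prompt) prediction and move `scale` times the distance toward the
    # text-conditioned prediction.
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]


# Toy noise predictions. Only the first component depends on the text prompt.
uncond = [0.0, 1.0]
cond = [1.0, 1.0]

low = cfg_combine(uncond, cond, 3.0)
high = cfg_combine(uncond, cond, 12.0)
print(low, high)  # the text-driven component grows linearly with the scale
```

The larger the scale, the more the text-driven difference dominates the final prediction, leaving less room for the style shared through attention to show up.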
It just does not work for me; the result is not anywhere close to the reference image.
I just used your json file as it was, with different images and text prompts
I'm also having a hard time replicating the example results. What am I missing?
It clearly does something, just not as... good as in the examples:
Same here, the "Style aligned reference" node does not work well.
But the batch node works well.
works for me, with the attached settings.
Using ddim sampler with ddim_uniform scheduler for both ddim inversion and for the final image generation? I tried with both sdxl and a sd1.5 model, and it is really not working at all.
You can see in my image, I get pretty close to the original image as far as features are concerned. I get even closer when I use BilboX Prompt and an embedding picker for the negative prompt. I think you just need to play with how you're prompting.
Yes, you have much, much better results than me. I'll work on my prompts.
While I haven't fully examined all the code, it seems to me that the StyleAlignedReferenceSampler implementation is incorrect.
What it should be doing for the reference alignment is:
- accept a series of noised latents for the reference image (call it ref_latents), indexed by timestep
- concat the fully noised final-timestep latent (e.g. ref_latents[0] if they are in reverse order) to the front of the initial latent as the first of a batch
- after each denoising step t (in the paper / official code they use callback_on_step_end for this), replace the latent of the first (reference) image in the batch with the reference latent from the next step, e.g. ref_latents[t+1] (this keeps it aligned at every step, otherwise it will denoise away from the reference)
At a glance, I don't see that behavior here. Unless I missed something (very possible), you're just using the final latent and not aligning it after every step to the reference steps.
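A minimal sketch of the loop I mean, in plain Python with lists standing in for latent tensors. The names sample_with_reference, denoise_step and toy_step are hypothetical and are not ComfyUI or diffusers APIs. It only shows the re-pinning of the reference slot, not the shared attention itself.

```python
from typing import Callable, List, Sequence

Latent = List[float]  # toy stand-in for a latent tensor


def sample_with_reference(
    ref_latents: Sequence[Latent],
    init_latents: List[Latent],
    denoise_step: Callable[[List[Latent], int], List[Latent]],
) -> List[Latent]:
    # ref_latents[0] is the fully noised reference, ref_latents[-1] the clean one.
    num_steps = len(ref_latents) - 1
    # Reference goes in slot 0 of the batch, generated images follow.
    batch = [list(ref_latents[0])] + [list(x) for x in init_latents]
    for t in range(num_steps):
        batch = denoise_step(batch, t)
        # Re-pin slot 0 to the stored reference latent for the next step,
        # so the reference cannot drift away from its inversion trajectory.
        batch[0] = list(ref_latents[t + 1])
    return batch


def toy_step(batch: List[Latent], t: int) -> List[Latent]:
    # Hypothetical denoiser: just subtracts 1.0 from every value.
    return [[v - 1.0 for v in latent] for latent in batch]


ref = [[9.0], [5.0], [2.0]]
out = sample_with_reference(ref, [[9.0]], toy_step)
print(out)  # [[2.0], [7.0]]
```

Without the re-pinning line, slot 0 would end at 7.0 like the other image instead of landing on the stored reference value 2.0. That is the drift I am describing.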
I added my current code in a PR -- it is giving me much better results https://github.com/brianfitzgerald/style_aligned_comfy/pull/8
The robot from the ship is a perfect example of how much better it can become:
I still find ComfyUI's code a bit of a mess to navigate, so this may have some mistakes, but it seems to be following the paper's strategy more closely with the reference noising.
Hopefully others here can try and see if this is working for them.
