
ResAdapter: Domain Consistent Resolution Adapter for Diffusion Models

Open rootonchair opened this issue 1 year ago • 9 comments

Model/Pipeline/Scheduler description

ResAdapter supports image generation at arbitrary resolutions for community models. The results are quite interesting. Project page: https://res-adapter.github.io/ GitHub: https://github.com/bytedance/res-adapter

Open source status

  • [X] The model implementation is available.
  • [X] The model weights are available (Only relevant if addition is not a scheduler).

Provide useful links for the implementation

No response

rootonchair avatar Mar 07 '24 09:03 rootonchair

Hey I would also like to work on this implementation. Have you already tried ResAdapter?

PacificG avatar Mar 07 '24 13:03 PacificG

Unfortunately, I also intended to work on this if possible.

rootonchair avatar Mar 07 '24 14:03 rootonchair

Hey, thanks for suggesting this! It indeed has very cool results. I'm not sure if there's anything to implement here, no? The resolution loras can be loaded like normal loras and there's no new modelling code involved AFAICT. It's just the position where the loras apply. Their example code works out of the box with diffusers as well. Maybe a training script would be interesting as they do not plan to release it due to company decisions as mentioned in [this] issue on their repo.
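For illustration, here is a minimal sketch of what "loaded like normal LoRAs" could look like. The helper name `attach_resolution_lora`, the adapter name, and the directory and file arguments are hypothetical; only `load_lora_weights` and `set_adapters` are existing diffusers pipeline methods (`set_adapters` assumes the PEFT backend is installed). The sketch covers only the LoRA part and is not the ResAdapter authors' example code.

```python
def attach_resolution_lora(pipe, lora_dir, weight_name,
                           adapter_name="res_adapter", scale=1.0):
    """Attach a LoRA checkpoint to an already-built diffusers pipeline.

    `lora_dir` and `weight_name` are placeholders for wherever the
    resolution LoRA weights live; nothing here is specific to ResAdapter.
    """
    # Standard diffusers LoRA loading, same call as for any other LoRA.
    pipe.load_lora_weights(lora_dir, weight_name=weight_name,
                           adapter_name=adapter_name)
    # Activate the adapter with the chosen strength.
    pipe.set_adapters([adapter_name], adapter_weights=[scale])
    return pipe


# Usage (not run here; model ID and paths are placeholders):
# from diffusers import StableDiffusionPipeline
# pipe = StableDiffusionPipeline.from_pretrained("<base-model-id>")
# attach_resolution_lora(pipe, "<lora-dir>", "<lora-file>.safetensors")
# image = pipe("a photo of a cat", height=768, width=768).images[0]
```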

a-r-r-o-w avatar Mar 07 '24 19:03 a-r-r-o-w

@a-r-r-o-w Sure, I think we can work on a training script for this model. However, I'm not sure we can reproduce the results, as it was trained on LAION-5B.

rootonchair avatar Mar 08 '24 03:03 rootonchair

@rootonchair @a-r-r-o-w @PacificG Thanks for your attention.

For inference, we provide a Hugging Face demo and a Replicate demo that you can use. We will also support ComfyUI.

For training, it is actually easy to reproduce if you want to do it. Here is some advice:

  1. For the dataset, make sure you can sample images from different resolutions and buckets. The choice of dataset does not matter for reproducing the results, because ResAdapter does not capture style information from the dataset.
  2. For the model architecture, insert LoRA and enable training of the group norm layers in the ResNet blocks.
  3. During training, when the resolution is > 512, enable group norm training; when the resolution is <= 512, disable it. The LoRA is always trained.
  4. During training, you can write a probability function to choose among the different resolutions.
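As a rough sketch of the "probability function" in step 4, one option is weighted sampling over resolution buckets. The bucket list, the weights, and the function name `sample_resolution` below are illustrative assumptions, not values from the paper or the ResAdapter repository.

```python
import random

# Hypothetical (height, width) buckets and sampling weights; tune both
# to the resolutions you actually want the adapter to cover.
RESOLUTION_BUCKETS = [(256, 256), (384, 384), (512, 512), (768, 768), (1024, 1024)]
BUCKET_WEIGHTS = [1.0, 1.0, 2.0, 1.0, 1.0]


def sample_resolution(rng, buckets=RESOLUTION_BUCKETS, weights=BUCKET_WEIGHTS):
    """Draw one (height, width) bucket with probability proportional to its weight."""
    return rng.choices(buckets, weights=weights, k=1)[0]


rng = random.Random(0)  # seeded so runs are repeatable
for step in range(3):
    height, width = sample_resolution(rng)
    print(f"step {step}: train on a batch from the {height}x{width} bucket")
```

Sampling one bucket per step keeps every image in a batch at the same size, which is the usual reason for bucketing in the first place.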

If anyone is interested in our work, feel free to try it.

Best,

jiaxiangc avatar Mar 12 '24 04:03 jiaxiangc

Thank you for your awesome work @jiaxiangc! I just finished reading the paper and think that the provided details would be enough for me to replicate the results, which I'll hopefully try working on in the next few days. I couldn't find info about why the group norm layers should be frozen for lower resolutions, could you explain? The reasoning mentioned in the paper is for it to adapt to the statistical distribution of feature maps of high res images, but shouldn't the equivalent case be true for low res images too (say 256x256)? Since most, if not all, SD models fail at lower resolutions currently. I will try experimenting to understand more.

I might also bother you with questions and for reviewing the implementation once completed :)

a-r-r-o-w avatar Mar 13 '24 05:03 a-r-r-o-w

@a-r-r-o-w Taking SD1.5 as an example, we initially experimented with both group norm and LoRA training enabled over resolutions from 128 to 1024. We found that at resolutions < 512, inserting only LoRA still achieves good results. Also, because group norm is essentially a mean and a variance, it cannot fit the resolution information from 128 to 1024 all at once the way LoRA can. Therefore, we only turn on group norm training when the resolution is larger than 512.
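A minimal PyTorch sketch of that rule (group norm trainable only above 512, LoRA always trainable) might look like the following. The helper `set_group_norm_trainable` and the tiny stand-in block are hypothetical; the sketch toggles every `nn.GroupNorm` in the module it is given, so restricting it to the ResNet blocks of a real UNet is left to the caller.

```python
import torch
from torch import nn


def set_group_norm_trainable(model: nn.Module, trainable: bool) -> int:
    """Toggle requires_grad on all GroupNorm parameters; return how many were set."""
    count = 0
    for module in model.modules():
        if isinstance(module, nn.GroupNorm):
            for param in module.parameters():
                param.requires_grad_(trainable)
                count += 1
    return count


# Tiny stand-in for a ResNet block; a real run would pass the UNet instead.
block = nn.Sequential(nn.Conv2d(8, 8, 3, padding=1), nn.GroupNorm(4, 8), nn.SiLU())

for resolution in (256, 768):
    # Rule from the thread: train group norm only when resolution > 512.
    set_group_norm_trainable(block, resolution > 512)
    x = torch.randn(1, 8, resolution // 64, resolution // 64)  # dummy features
    block(x).mean().backward()
    print(resolution, "group norm trainable:", block[1].weight.requires_grad)
    block.zero_grad()
```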

jiaxiangc avatar Mar 13 '24 17:03 jiaxiangc

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

github-actions[bot] avatar Apr 07 '24 15:04 github-actions[bot]
