checkpoint_merger community pipeline: add block weighted and module-specific alpha options

Open damian0815 opened this issue 2 years ago • 4 comments

Some enhancements for the checkpoint_merger community pipeline cc @Abhinay1997

  • Different modules can be merged with different alphas via the module_override_alphas kwarg (e.g. {'unet': 0.2, 'text_encoder': 0.6}).
  • Block-weighted merges, where a different weight can be used for different layers within the unet via the blocks kwarg (e.g. [0,0,0,0,0,0,0,0,0,0,0,0,0.5,1,1,1,1,1,1,1,1,1,1,1,1]). Specify 12 weights for the down blocks (counting from the input layer), 1 weight for the middle block, and 12 weights for the up blocks (counting from the middle layer). You can find an explanation of block-weighted merging here (cw: waifus), and some weight presets to use here (illustrated here). A rough sketch of both kwargs is shown below.
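
For illustration only (the exact merge() signature and the model ids below are placeholders, not part of this issue), the two kwargs might be passed through the existing community CheckpointMergerPipeline roughly like this:

```python
from diffusers import DiffusionPipeline

# Load the community checkpoint merger pipeline (placeholder model id).
pipe = DiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    custom_pipeline="checkpoint_merger",
)

merged = pipe.merge(
    ["CompVis/stable-diffusion-v1-4", "runwayml/stable-diffusion-v1-5"],
    alpha=0.5,
    # Proposed kwarg: per-module override of the global alpha.
    module_override_alphas={"unet": 0.2, "text_encoder": 0.6},
    # Proposed kwarg: 25 block weights - 12 down, 1 mid, 12 up.
    blocks=[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
            0.5,
            1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
)
```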

damian0815 avatar Feb 19 '23 14:02 damian0815

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

@Abhinay1997 re: your idea to make the block weights target class dynamic, I'm not sure that will work - the layer → block weighting index logic is very brittle and as-is only works for the unet (and it may stop working for SD3.0), e.g.: https://github.com/huggingface/diffusers/blob/01eff5b282f5cba54fcb0b799b94ae162ad69acb/examples/community/checkpoint_merger.py#L25
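
For context, a purely illustrative sketch (not the actual checkpoint_merger code) of the kind of hard-coded key-prefix lookup involved - the UNet-specific prefixes are exactly why this is brittle and hard to generalise to other target classes:

```python
import re

# Illustrative only: map a diffusers UNet state_dict key to one of the 25
# block weights (12 down, 1 mid, 12 up). A real mapping has to split each
# down/up block into finer-grained sub-layers to reach 12 weights per side;
# this crude version just shows how UNet-specific the key prefixes are.
def block_weight_for_key(key: str, blocks: list, default: float) -> float:
    m = re.match(r"down_blocks\.(\d+)\.", key)
    if m:
        return blocks[int(m.group(1))]
    if key.startswith("mid_block."):
        return blocks[12]
    m = re.match(r"up_blocks\.(\d+)\.", key)
    if m:
        return blocks[13 + int(m.group(1))]
    return default  # conv_in, conv_out, time embeddings, ...
```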

damian0815 avatar Feb 19 '23 15:02 damian0815

@damian0815, yes, what you said makes sense. For now, I think this is a good feature to add to Checkpoint Merger. We'll have to figure out fixes to the block_weight dict lookup in case of a mismatch with later diffusion models.

Gently pinging @patrickvonplaten to review and add your thoughts.

Abhinay1997 avatar Feb 19 '23 15:02 Abhinay1997

@pcuenca gently pinging you to add your thoughts on this

Abhinay1997 avatar Feb 21 '23 07:02 Abhinay1997

This looks good to me! Happy to go ahead with the PR if you like (cc @damian0815 and @Abhinay1997)

patrickvonplaten avatar Mar 03 '23 18:03 patrickvonplaten

@damian0815 I'm good with the changes. Can you move this from draft to a final PR?

@patrickvonplaten, can we have multiple contributors in the README.md for a community pipeline? This is a big change and I think it's only fair that damian0815 is listed as a contributor to CheckpointMergerPipeline.

Abhinay1997 avatar Mar 04 '23 05:03 Abhinay1997

> @damian0815 I'm good with the changes. Can you move this from draft to a final PR?
>
> @patrickvonplaten, can we have multiple contributors in the README.md for a community pipeline? This is a big change and I think it's only fair that damian0815 is listed as a contributor to CheckpointMergerPipeline.

Sure, feel free to change :-)

patrickvonplaten avatar Mar 07 '23 11:03 patrickvonplaten

@damian0815,

Let me know if this is good to merge for you.

patrickvonplaten avatar Mar 07 '23 11:03 patrickvonplaten

Ahh @patrickvonplaten, no, not yet - there are some bug fixes in my other lib grate (https://github.com/damian0815/grate/blob/main/src/sdgrate/checkpoint_merger_mbw.py) that I need to find some time to copy over and test.

damian0815 avatar Mar 09 '23 13:03 damian0815

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

github-actions[bot] avatar Apr 02 '23 15:04 github-actions[bot]

@damian0815, do let me know if there's anything I can help with in case you have time constraints :)

Abhinay1997 avatar Apr 10 '23 06:04 Abhinay1997

> @damian0815, do let me know if there's anything I can help with in case you have time constraints :)

uhh, yes actually - you could port the changes from https://github.com/damian0815/grate/blob/main/src/sdgrate/checkpoint_merger_mbw.py here (actually it should be enough to just copy/paste the whole file)

damian0815 avatar Apr 11 '23 09:04 damian0815

However, FYI: snapshot_download is an ill-advised way of downloading models, because it will download eeeeeverything - if the model has both .safetensors and .bin files you'll end up downloading 9GB of data for a 4.5GB model, and depending on if/how there's an fp16 revision of the model, that might get downloaded too - 11GB of data for a 4.5GB model, potentially more. The snapshot_download call should really be replaced with a call to StableDiffusionPipeline.from_pretrained(...), reading off the individual components (e.g. pipeline.unet, pipeline.text_encoder) rather than attempting to load the files by hand from disk.
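
A rough sketch of what I mean (the model id is just a placeholder): let from_pretrained resolve and fetch only the files the pipeline needs, then read the modules off the loaded pipeline instead of walking the snapshot on disk.

```python
import torch
from diffusers import StableDiffusionPipeline

# Placeholder model id; from_pretrained only fetches the files this
# pipeline variant actually needs.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,
)

# Read the components off the pipeline rather than loading files by hand.
unet_state_dict = pipe.unet.state_dict()
text_encoder_state_dict = pipe.text_encoder.state_dict()
```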

damian0815 avatar Apr 11 '23 09:04 damian0815

@damian0815 BTW we have a download function now as well: https://huggingface.co/docs/diffusers/main/en/api/diffusion_pipeline#diffusers.DiffusionPipeline.download
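
Rough usage, per the linked docs (the model id is just an example):

```python
from diffusers import DiffusionPipeline

# Downloads only the files the pipeline needs and returns the local folder,
# without instantiating the models in memory.
local_dir = DiffusionPipeline.download("runwayml/stable-diffusion-v1-5")
```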

patrickvonplaten avatar Apr 12 '23 12:04 patrickvonplaten

Hi @damian0815! Thanks for the input. I'll add your changes over the weekend and test. As for snapshot_download, it was a tradeoff at the time to avoid loading all the pipelines into memory at once.

However, given that we have DiffusionPipeline.download now, things are a lot easier. Let me make those changes after I port your changes here. Thanks for suggesting this @patrickvonplaten :)

Can't do it till the weekend though. 🥲

Abhinay1997 avatar Apr 12 '23 13:04 Abhinay1997

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

github-actions[bot] avatar May 06 '23 15:05 github-actions[bot]

=[

bghira avatar May 18 '23 23:05 bghira

Hi! Is there an update on this issue? I would like to use the checkpoint merger pipeline for a project, but I can't seem to find whether this has been completed or not.

HamnaAkram avatar Sep 08 '23 05:09 HamnaAkram

@HamnaAkram you can find the checkpoint merger pipeline in the community pipelines. This issue was raised for an enhancement to the original.
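
Roughly how the existing community pipeline is used today (see the community pipelines README for the current, exact example; the model ids below are placeholders):

```python
from diffusers import DiffusionPipeline

# Load the community pipeline by name.
pipe = DiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    custom_pipeline="checkpoint_merger",
)

# Merge two checkpoints with a global interpolation weight.
merged_pipe = pipe.merge(
    ["CompVis/stable-diffusion-v1-4", "runwayml/stable-diffusion-v1-5"],
    interp="sigmoid",
    alpha=0.4,
)
```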

Abhinay1997 avatar Sep 08 '23 05:09 Abhinay1997

Thanks for pointing me in the right direction, but I have noticed an issue with the existing pipeline. When I set alpha=0, the merged model should be identical to the first model, and when alpha=1 it should be equivalent to the second model. However, this is not the case: after merging at alpha=0 or 1, the merged model is not identical to the corresponding original. I would love to hear any feedback on this.
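
For example, here is a hypothetical check (the helper and pipeline names are just illustrative) of what I would expect to hold:

```python
import torch

def count_mismatched_unet_tensors(merged_pipe, reference_pipe):
    """Count UNet tensors that differ between a merged pipeline and a
    reference pipeline; at alpha=0 I would expect 0 mismatches against the
    first model, and at alpha=1 against the second."""
    ref_sd = reference_pipe.unet.state_dict()
    merged_sd = merged_pipe.unet.state_dict()
    return sum(not torch.equal(ref_sd[k], merged_sd[k]) for k in ref_sd)
```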

HamnaAkram avatar Sep 13 '23 05:09 HamnaAkram

@HamnaAkram that's weird. Can you open a new issue (it helps others find it in the future) with a code sample? Please include any output logs and the checkpoint names or model_index.json files if possible.

I'll try to reproduce it.

Abhinay1997 avatar Sep 13 '23 05:09 Abhinay1997