checkpoint_merger community pipeline: add block weighted and module-specific alpha options
Some enhancements for the checkpoint_merger community pipeline (cc @Abhinay1997):

- Different modules can be merged with different alphas via the `module_override_alphas` kwarg (e.g. `{'unet': 0.2, 'text_encoder': 0.6}`).
- Block-weighted merges, whereby a different weight can be used for different layers within the unet, via the `blocks` kwarg (e.g. `[0,0,0,0,0,0,0,0,0,0,0,0,0.5,1,1,1,1,1,1,1,1,1,1,1,1]`). Specify 12 weights for the down blocks (counting from the input layer), 1 weight for the middle block, and 12 weights for the up blocks (counting from the middle layer). You can find an explanation of block-weighted merging here (cw: waifus), and some weight presets to use here (illustrated here). A usage sketch follows below.
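For illustration, a minimal usage sketch. The model IDs and alpha values are placeholders; `merge()` and its `alpha` kwarg come from the existing community pipeline, while `module_override_alphas` and `blocks` are the kwargs added by this PR:

```python
from diffusers import DiffusionPipeline

# Attach the community checkpoint_merger pipeline to any base checkpoint.
pipe = DiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    custom_pipeline="checkpoint_merger",
)

# Merge two checkpoints: text_encoder and unet get their own alphas, and the
# unet additionally gets per-block weights (12 down + 1 mid + 12 up).
merged = pipe.merge(
    ["CompVis/stable-diffusion-v1-4", "runwayml/stable-diffusion-v1-5"],
    alpha=0.5,
    module_override_alphas={"unet": 0.2, "text_encoder": 0.6},
    blocks=[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
            0.5,
            1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
)
```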
@Abhinay1997 re: your idea to make the block weights target class dynamic, I'm not sure that will work - the layer → block weighting index logic is very brittle and as-is only works for the unet (and it may stop working for SD 3.0), e.g.: https://github.com/huggingface/diffusers/blob/01eff5b282f5cba54fcb0b799b94ae162ad69acb/examples/community/checkpoint_merger.py#L25
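To give a rough flavor of why (this is a simplified sketch, not the actual lookup in the file above): the mapping hard-codes the SD 1.x/2.x unet layout, so it breaks as soon as parameter names or block counts change:

```python
# Simplified sketch of a layer-name -> block-weight-index lookup, assuming
# the SD 1.x/2.x unet layout (4 down blocks, 1 mid block, 4 up blocks, with
# each down/up block coarsely spanning 3 of the 12 per-side weights).
def block_weight_index(param_name: str) -> int:
    parts = param_name.split(".")
    if parts[0] == "down_blocks":
        return int(parts[1]) * 3          # indices 0..11
    if parts[0] == "mid_block":
        return 12
    if parts[0] == "up_blocks":
        return 13 + int(parts[1]) * 3     # indices 13..24
    return 0  # conv_in, time_embedding, conv_out, ... lumped with weight 0

blocks = [0] * 12 + [0.5] + [1] * 12
print(blocks[block_weight_index("up_blocks.2.resnets.0.conv1.weight")])  # 1
```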
@damian0815, yes, what you said makes sense. For now, I think this is a good feature to add to the Checkpoint Merger. We'll have to figure out fixes to the block_weight dict lookup in case of a mismatch with later diffusion models.
Gently pinging @patrickvonplaten to review and add your thoughts.
@pcuenca gently pinging you to add your thoughts on this
This looks good to me! Happy to go ahead with the PR if you like (cc @damian0815 and @Abhinay1997 )
@damian0815 I'm good with the changes. Can you move this from draft to a final PR?
@patrickvonplaten, can we have multiple contributors in the README.md for a community pipeline? This is a big change and I think it's only fair that @damian0815 is listed as a contributor to CheckpointMergerPipeline.
Sure, feel free to change :-)
@damian0815, let me know if this is good to merge for you.
ahh @patrickvonplaten no, not yet - there are some bug fixes in my other lib, grate (https://github.com/damian0815/grate/blob/main/src/sdgrate/checkpoint_merger_mbw.py), that I need to find some time to copy over and test.
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
@damian0815, do let me know if there's anything I can help on in case you have time constraints :)
uhh, yes actually - you could port the changes from https://github.com/damian0815/grate/blob/main/src/sdgrate/checkpoint_merger_mbw.py here (actually it should be enough to just copy/paste the whole file)
however, FYI: `snapshot_download` is an ill-advised way of downloading models, because it will download eeeeeverything - if the model has both `.safetensors` and `.bin` files, you'll end up downloading 9GB of data for a 4.5GB model. And depending on if/how there's an fp16 revision of the model, that might get downloaded too - 11GB of data for a 4.5GB model, potentially more. The `snapshot_download` should really be replaced with a call to `StableDiffusionPipeline.from_pretrained(...)`, reading the individual components off the returned pipeline (e.g. `pipeline.unet`, `pipeline.text_encoder`, etc.) rather than attempting to load the files by hand from disk.
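i.e. something along these lines (the model ID is just a placeholder):

```python
from diffusers import StableDiffusionPipeline

# from_pretrained fetches only the files needed to instantiate the pipeline
# (one weight format, one precision), rather than the whole repo snapshot.
pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")

# Read the components off the pipeline instead of loading files from disk.
unet = pipe.unet
text_encoder = pipe.text_encoder
vae = pipe.vae
```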
@damian0815 BTW we have a `download` function now as well:
https://huggingface.co/docs/diffusers/main/en/api/diffusion_pipeline#diffusers.DiffusionPipeline.download
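For example (the model ID is a placeholder):

```python
from diffusers import DiffusionPipeline

# Fetches only the files needed to load the pipeline and returns the path to
# the local snapshot, without instantiating anything in memory.
local_dir = DiffusionPipeline.download("runwayml/stable-diffusion-v1-5")
```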
Hi @damian0815! Thanks for the inputs. I'll add your changes over the weekend and test. As for `snapshot_download`, it was a tradeoff at the time to avoid loading all the pipelines into memory at once. However, given that we have `DiffusionPipeline.download` now, things are a lot easier. Let me make the changes after I port your changes here. Thanks for suggesting this @patrickvonplaten :)
Can't do it till the weekend though. 🥲
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
=[
Hi! Is there an update on this issue? I would like to use the checkpoint merger pipeline for a project, but I can't tell whether this enhancement has been completed or not.
@HamnaAkram you can find the checkpoint merger pipeline in the community pipelines. This issue was raised for an enhancement to the original.
Thanks for pointing me in the right direction, but I've noticed an issue with the existing pipeline. With alpha=0, the merged model should be identical to the first model, and with alpha=1 it should be equivalent to the second model. However, this is not the case: after merging at alpha=0 or alpha=1, the merged model differs from the corresponding original. I would love to hear any feedback regarding this.
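For reference, this is roughly how I'm checking it (a sketch; the model IDs are placeholders for the checkpoints I actually used):

```python
import torch
from diffusers import DiffusionPipeline

# Placeholder model IDs: substitute the two checkpoints being merged.
pipe = DiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4", custom_pipeline="checkpoint_merger"
)
# With alpha=0 the merged unet should match the first model's unet exactly.
merged = pipe.merge(
    ["CompVis/stable-diffusion-v1-4", "runwayml/stable-diffusion-v1-5"],
    alpha=0.0,
)

original = DiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4")
merged_sd = merged.unet.state_dict()
for name, p in original.unet.state_dict().items():
    q = merged_sd[name]
    if not torch.equal(p, q):
        print(f"{name}: max abs diff {(p - q).abs().max().item()}")
```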
@HamnaAkram that's weird. Can you open a new issue (it helps others find it in the future) with a code sample? Please include any output logs and the checkpoint names or model_index.json files if possible.
I'll try to reproduce it.