
Add single file support for Stable Cascade

Open DN6 opened this issue 1 year ago • 3 comments

What does this PR do?

Adds single file support to the StableCascadeUNet, which allows users to load in checkpoints published in the original format.

The single file loading logic and mappings are defined within the UNet itself rather than in a Mixin, since the mapping functions have to be accessible to the from_single_file method somehow. It relies on fetching the configs from a repo hosted on the diffusers org.

Loading at the model level is more practical than loading a whole pipeline from a single file, since a single file pipeline checkpoint would be quite large (~34GB) to load. Single file loading with combined pipelines is also not something we support at the moment. It is particularly challenging with Cascade because of the dtype restrictions in the Prior (Stage C) pipeline, which cannot work with float16, while the decoder (Stage B) can.

Fixes # (issue)

Before submitting

  • [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • [ ] Did you read the contributor guideline?
  • [ ] Did you read our philosophy doc (important for complex PRs)?
  • [ ] Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
  • [ ] Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
  • [ ] Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR.

DN6 avatar Mar 11 '24 07:03 DN6

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

My biggest concern is that the current design for single-file checkpoint loading departs massively from how we support it for ControlNet and VAE. Personally, I like this current approach, though!

I think what we can do is move towards a single SingleFileModelMixin and use class attributes to pass in the functions that convert the checkpoint from the original format to diffusers. I can do that in a follow-up for all the model classes that use from_single_file.
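A minimal sketch of what that proposal could look like, under stated assumptions. Only the names SingleFileModelMixin and from_single_file come from this thread. The attribute name `_checkpoint_converter`, the `load_converted` method, the `ToyUNet` class, and the toy key-renaming rule are hypothetical and are not the diffusers implementation. Plain dicts stand in for state dicts so the example runs without torch.

```python
from typing import Callable, ClassVar, Dict, Optional

Checkpoint = Dict[str, object]


class SingleFileModelMixin:
    """Hypothetical shared mixin: each model class supplies its own converter."""

    # Hypothetical attribute name. Subclasses point it at a function that maps
    # an original-format checkpoint to the diffusers key layout.
    _checkpoint_converter: ClassVar[Optional[Callable[[Checkpoint], Checkpoint]]] = None

    @classmethod
    def from_single_file(cls, checkpoint: Checkpoint):
        converter = cls._checkpoint_converter
        if converter is None:
            raise NotImplementedError(
                f"{cls.__name__} does not define a checkpoint converter."
            )
        model = cls()
        # load_converted is a hypothetical hook standing in for load_state_dict.
        model.load_converted(converter(checkpoint))
        return model


def convert_toy_unet_checkpoint(checkpoint: Checkpoint) -> Checkpoint:
    # Toy rename rule for illustration only; not a real Stable Cascade mapping.
    return {
        key.replace("old_block.", "new_block.", 1): value
        for key, value in checkpoint.items()
    }


class ToyUNet(SingleFileModelMixin):
    # staticmethod keeps the function from being bound as an instance method.
    _checkpoint_converter = staticmethod(convert_toy_unet_checkpoint)

    def __init__(self) -> None:
        self.state: Checkpoint = {}

    def load_converted(self, state: Checkpoint) -> None:
        self.state = dict(state)


model = ToyUNet.from_single_file({"old_block.weight": 1, "other": 2})
```

With this shape, adding single file support to another model class would only require defining a converter function and setting the class attribute, while the loading logic stays in one place.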

DN6 avatar Mar 11 '24 13:03 DN6

I think what we can do is move towards a single SingleFileModelMixin and use class attributes to pass in the functions to convert the checkpoint from original format to diffusers.

I don't fully understand. Could you maybe elaborate on this with pseudocode?

sayakpaul avatar Mar 11 '24 13:03 sayakpaul

great stuff. minor nitpick, there are several references to ControlNetModel, i guess since this implementation is copied from there?

vladmandic avatar Mar 13 '24 03:03 vladmandic

The docstring is. https://github.com/huggingface/diffusers/pull/7295 should clean it up.

sayakpaul avatar Mar 13 '24 03:03 sayakpaul