
Stable Diffusion 2.1-based models fail to load or generate blank black images

manifestbound opened this issue 2 years ago · 1 comment

Hi 👋

I’ve been doing some experiments with the app and, for the most part, it works. However, I’ve noticed that the Diffusion app fails to work with models that are based on Stable Diffusion 2.1.

  • For split_einsum_compiled Core ML models, the app gets stuck reporting “Preparing the model...” or “Generating...” in the console and never finishes, even after waiting hours; no image is ever produced.
  • For original_compiled Core ML models, the app does load the model, but every image it generates is completely black.

This behavior is consistent across a variety of SD 2.1 checkpoints that I’ve confirmed to have this issue.

Strangely, stable-diffusion-2-1-base does work, so I’m pretty confused. I wanted to report this issue because it would heavily restrict users who want to use models beyond SD 1.x.

Even more peculiarly, this behavior isn’t unique to this app; other Stable Diffusion apps are exhibiting similar issues (see here).

System Info:

  • Mac mini (M1, 2020)
  • RAM: 16 GB
  • macOS 13.3 (22E252)

manifestbound · Apr 09 '23 17:04

Update: after spending time triaging and following guidance from a few Discord servers, I am finally able to generate something?

[Attached result image — caption: “man standing in the middle of a city 9 3319909579”]

It’s still not quite right compared to what the model should be capable of, but going from a black image to something recognizable involved changing how the models are created, plus a pile of hacks to patch deficiencies in ml-stable-diffusion:

  • Applying two changes: editing torch2coreml.py so it generates FP32 Core ML models that run on the GPU, and editing DPMSolverMultistepScheduler.swift to use “v-prediction” instead of “epsilon-prediction” (rough sketches of both changes follow this list)
  • Generating FP32 Core ML models
  • Editing the Diffusion app to use a local copy of the ml-stable-diffusion Swift package with the “v-prediction” change
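
For reference, here is a minimal sketch of what the FP32 conversion change amounts to, using coremltools directly rather than the actual torch2coreml.py code. The TinyNet module is just a stand-in so the snippet runs on its own; the real script traces the SD 2.1 UNet instead:

```python
import coremltools as ct
import torch

# Stand-in module so the sketch is self-contained; torch2coreml.py
# traces the real SD 2.1 UNet here instead.
class TinyNet(torch.nn.Module):
    def forward(self, sample):
        return sample * 2.0

traced = torch.jit.trace(TinyNet().eval(), torch.zeros(2, 4, 96, 96))

mlmodel = ct.convert(
    traced,
    convert_to="mlprogram",
    inputs=[ct.TensorType(name="sample", shape=(2, 4, 96, 96))],
    # ML Programs default to FLOAT16; forcing FLOAT32 avoids the FP16
    # overflow that plausibly produces the all-black SD 2.1 outputs.
    compute_precision=ct.precision.FLOAT32,
    # Keep inference on CPU/GPU; the Neural Engine computes in FP16,
    # which would defeat the FP32 conversion.
    compute_units=ct.ComputeUnit.CPU_AND_GPU,
)
mlmodel.save("Unet_fp32.mlpackage")
```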
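
And here is the scheduler-side difference. The SD 2.1 768×768 checkpoints were trained with v-prediction (Salimans & Ho, 2022), while 2.1-base uses epsilon prediction, which would explain why the base model worked all along. The real fix lives in DPMSolverMultistepScheduler.swift; this Python function (the name and signature are mine, not the package’s) just shows the math:

```python
def predicted_x0(model_output, sample, alpha_t, sigma_t, prediction_type):
    """Recover the predicted clean sample x0 from the network output.

    Uses the DDPM convention alpha_t**2 + sigma_t**2 == 1, so that
    x_t = alpha_t * x0 + sigma_t * eps.
    """
    if prediction_type == "epsilon":
        # Network predicts the noise eps; solve x_t = alpha*x0 + sigma*eps.
        return (sample - sigma_t * model_output) / alpha_t
    if prediction_type == "v_prediction":
        # Network predicts v = alpha*eps - sigma*x0; solving for x0 gives:
        return alpha_t * sample - sigma_t * model_output
    raise ValueError(f"unknown prediction_type: {prediction_type}")
```

In diffusers terms this is the difference between prediction_type="epsilon" and prediction_type="v_prediction" on the scheduler config; the Swift scheduler needs the equivalent branch wherever it converts the model output into a predicted clean sample.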

Could someone more technical than me flag this as a major issue affecting newer models (like 2.1) going forward?

manifestbound · Apr 10 '23 03:04