ml-stable-diffusion
Image generation not working with ControlNet on iPadOS: out-of-memory crash
Hi! :)
I'm running into an out-of-memory crash on a physical iPad Pro M2 (2022) when trying to generate an image with a controlNet model.
Specs:
- iPad Pro 11-inch (4th generation)
- iPadOS 17.0
- Xcode 15.0.0
- ml-stable-diffusion 1.0.0
Models used:
- Stable Diffusion 1.5: https://huggingface.co/runwayml/stable-diffusion-v1-5
- ControlNet Canny: https://huggingface.co/lllyasviel/sd-controlnet-canny
Parameters:
- imageCount = 1
- stepCount = 15
- scheduler = .dpmSolverMultistepScheduler
- disableSafety = true
- reduceMemory = true
- computeUnits = .cpuAndNeuralEngine
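For reference, here's a minimal sketch of how these settings map onto the Swift package's API (simplified; the resources path and the ControlNet model name are placeholders, and exact signatures may differ slightly between versions):

```swift
import CoreML
import StableDiffusion

// Placeholder path to the folder containing the compiled .mlmodelc bundles.
let resourceURL = URL(fileURLWithPath: "/path/to/Resources")

let mlConfig = MLModelConfiguration()
mlConfig.computeUnits = .cpuAndNeuralEngine

// "LllyasvielSdControlnetCanny" is a placeholder; it should match the name of the
// converted ControlNet .mlmodelc bundle inside the resources folder.
let pipeline = try StableDiffusionPipeline(
    resourcesAt: resourceURL,
    controlNet: ["LllyasvielSdControlnetCanny"],
    configuration: mlConfig,
    disableSafety: true,
    reduceMemory: true
)
try pipeline.loadResources()

var pipelineConfig = StableDiffusionPipeline.Configuration(prompt: "a photo of a cat")
pipelineConfig.imageCount = 1
pipelineConfig.stepCount = 15
pipelineConfig.schedulerType = .dpmSolverMultistepScheduler
```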
Converted the models using the provided Python package with this command:

```bash
python -m python_coreml_stable_diffusion.torch2coreml --convert-unet --convert-text-encoder --convert-vae-decoder --convert-safety-checker --model-version "runwayml/stable-diffusion-v1-5" --chunk-unet --unet-support-controlnet --convert-controlnet "lllyasviel/sd-controlnet-canny" --bundle-resources-for-swift-cli -o "/path/to/save/the/converted/model"
```

Also tried with a model generated with an added `--quantize-nbits 6` argument, same result.
The `StableDiffusionPipeline` is successfully created, and the out-of-memory crash happens when the generator is already sending progress callbacks.
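Roughly, the generation call looks like this (continuing the sketch above; the Canny conditioning image is a placeholder `CGImage`):

```swift
// Placeholder conditioning image (e.g. a Canny edge map rendered as a CGImage).
pipelineConfig.controlNetInputs = [cannyConditioningImage]

// The crash happens after several of these progress callbacks have already fired.
let images = try pipeline.generateImages(configuration: pipelineConfig) { progress in
    print("Step \(progress.step) of \(progress.stepCount)")
    return true // returning false would cancel generation
}
```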
Without ControlNet the generation process is smooth and really fast. The crash happens only on a physical device; on an iPad simulator it successfully generates an image (though it takes ages).
The heaviest work is happening inside the `Unet.predictions(from:)` method.
P.S. Thanks for such an exciting package! Hoping to get to play around with controlNet soon :)
Have you tried with a conversion that uses both `--quantize-nbits 6` and `--attention-implementation SPLIT_EINSUM_V2`? Along with `--chunk-unet`, that should get you every optimization available for mobile devices.
@jrittvo just tried converting the same models using the same command, but with `--attention-implementation SPLIT_EINSUM_V2`, no changes :(
I wonder if anyone has had any luck using ControlNet on iOS/iPadOS devices. I've tested it on macOS (MochiDiffusion) with the original models (the ones you converted, by the way 😁), but I haven't seen any examples on iOS/iPadOS.
You're the first I've heard of experimenting with ControlNet on iOS. Is the screen cap you posted from your physical device or from the simulator? Either way, what are the comparable numbers without a ControlNet? I've wondered what kind of memory load a ControlNet adds.
As a side note, `--quantize-nbits` also reduces the size of the converted ControlNet model, and on a Mac you can use 6-bit ControlNet models with 16-bit base models; I see no change in quality at all using the reduced-size ControlNet model. I think I'm going to convert all my CN models to the smaller size for use in any situation.
The list of target devices says M1 for iPad. (A14 on iPhone.) Maybe that is a hard and fast limit? Seems it would also fail on the simulator, though, if it is truly an accurate simulation set for your specs.
Thanks for the report @SaladDays831! Are you able to share any logs from right around the time it crashes? Could you also try without `--chunk-unet`? After `--quantize-nbits 6`, you no longer need to chunk the Unet, and it might be causing some issues.
Hi @atiorh! Thanks for your response. As you suggested, I tried running it with an un-chunked model and it did... something 😄 It gets to about step 5/15, the process gets detached from Xcode, but the generation keeps running on the iPad. It finishes successfully, but the generated image always ends up being a mess (attached an example). I've gotten similar results when using a faulty model, or a model not compatible with the selected compute unit (`cpuAndNeuralEngine` in my case).
Just to clarify: on the iPad simulator, using the model generated by the command in the issue description, I get good results from ControlNet.
Which logs exactly would be helpful? No errors get printed to the Xcode console, and Console.app connected to the iPad doesn't seem to have anything of interest at the time of the crash.
This may be a red herring, but with the Swift CLI, I think I remember the step counter progress line in Terminal showing that with a CN, the step count it used was half of the count I set in my prompt command. So if you prompted 15 steps, you may only be getting 7. Your image looks like more than 7 steps, but maybe try a prompt with 30 steps?
@jrittvo yeah, I also noticed this, but for me it actually happens only with img2img without CN. CN and text2img don't halve the step count. I tried increasing the step count anyway and noticed that it still crashes at around step 18-21.
Also just tested another model without `--chunk-unet` and without `--attention-implementation SPLIT_EINSUM_V2`, getting the same behavior as in my previous comment.
Deleted an earlier post. Issue was with the app I built. It was still using ml-stable-diffusion 0.4.0. Will try building it again tomorrow with 1.0.0 in Xcode 15 beta and testing in macOS 14 beta.
@jrittvo thanks for your input! I'll test different model combinations/parameters and post the results here once I have something.
Anyone who has gotten ControlNet running on iOS/iPadOS, please let me know! 😁
I built Mochi Diffusion with ml-stable-diffusion 1.0.0 and Xcode 15 beta 2 and ran it in macOS 14 beta 2. So far I've only tested without ControlNets. The split-einsum-v1 models, using NE, run when they feel like it. The split-einsum-v2 model, using NE, will not even load successfully. All model types work fine with CPU and GPU. This new build of Mochi has a memory de-allocation bug (it tries to delete the old model twice) when changing models or compute unit that we need to fix. Maybe this also has something to do with the s-e-v2 model not running on NE. I will get a conda environment set up to run the Swift CLI Stable Diffusion pipeline in the next day or two to test without the Mochi variable in the mix.
Have you gotten anywhere with your iPad and the ControlNets? I have a Swift CLI conda environment with ml-stable-diffusion-1.0.0 running in macOS 14 beta 2 with Xcode 15 beta 2 now. The issue I had with the split-einsum-v2 models using cpuAndNeuralEngine in the Mochi app is also happening in the Swift CLI, so the bug is not coming from the app. The same bug is present using the split-einsum-v2 models with ml-stable-diffusion-0.4.0 running in macOS 13.5 with Xcode 14.3, but I don't know that the split-einsum-v2 models were ever intended to work in that environment, so I'm going to keep this just about macOS 14, Xcode 15, and ml-stable-diffusion-1.0.0.
Sometime over the weekend I will open an issue on my bug. I believe it is reproducible, and I have a good lead on what it is about that I think will help someone smarter than me to start tracing it.
Hi @jrittvo! Sorry, I was offline for a couple of days
Not really. I switched focus to some other stuff hoping to get some input from the repo maintainers, as it looks like this is an internal issue unrelated to the model type or conversion parameters.
At some point I'll start testing this again, but I guess I've already juggled all the model parameters :(
@atiorh if I can help you debug this issue, please let me know which logs I can provide to give you more context. The Xcode console is not showing any errors.
I got it running using a model converted with `--quantize-nbits 6 --attention-implementation SPLIT_EINSUM_V2` and no `--chunk-unet`, but only with prompt-to-image. `startingImage` should be nil, otherwise the same out-of-memory crash occurs. It still uses a lot of memory and sometimes disconnects from the debugger during generation, but at least it works. Image-to-image with ControlNet still doesn't work on a physical device.
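In configuration terms, the difference is just this (a sketch using the same configuration object as in the issue description; `cannyConditioningImage`, `sourceImage`, and the strength value are placeholders):

```swift
// Works on device: text-to-image with ControlNet conditioning only.
pipelineConfig.startingImage = nil
pipelineConfig.controlNetInputs = [cannyConditioningImage]

// Still crashes on device with the same out-of-memory behavior:
// image-to-image plus ControlNet.
pipelineConfig.startingImage = sourceImage // the CGImage to start from
pipelineConfig.strength = 0.7              // assumed value; how far to deviate from the starting image
```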
Hey @SaladDays831, thanks for your patience! I am just now able to get back to this thread. Glad to hear you have ControlNet running (albeit without the startingImage) on the iPad! Providing `startingImage` loads the `VAEEncoder.mlmodelc` (~70MB of weights + some more for activations), but that should almost not matter when `reduceMemory` is enabled, as the encoder would be offloaded after the image is encoded once, before the diffusion loop starts. Could you please confirm that `reduceMemory` is still enabled in your tests?
Hi @atiorh!
I realized that I was missing the `VAEEncoder.mlmodelc`; adding `--convert-vae-encoder` to my command converted it successfully, and I'm able to run ControlNet with `startingImage` now.
If I'm not missing anything, the `--convert-vae-encoder` flag is not listed anywhere in the "Converting Models to Core ML" section of the README, so it was a lucky guess 😁
Closing the issue as it's resolved. Also, here's the final conversion command I used, in case it helps somebody like me out :)

```bash
python -m python_coreml_stable_diffusion.torch2coreml --convert-unet --convert-text-encoder --convert-vae-decoder --convert-vae-encoder --convert-safety-checker --model-version "runwayml/stable-diffusion-v1-5" --unet-support-controlnet --quantize-nbits 6 --attention-implementation SPLIT_EINSUM_V2 --convert-controlnet "lllyasviel/sd-controlnet-canny" --bundle-resources-for-swift-cli -o "/path/to/save/the/model"
```
Thanks @SaladDays831, great to hear this is working now! Feedback received; I'll make `--convert-vae-encoder` more visible in the README.
Hi @SaladDays831
It sounds like you're using the `reduceMemory = true` setting and it works fine. Could you let me know where you've placed your model files? I currently have my downloaded model files in the "Download" directory. When I set `reduceMemory = true`, it looks for the text encoder deeper, at TextEncoder.mlmodelc/coremldata.bin, then says it can't find the model file and crashes. However, if I set it to `false`, it's able to locate the file and load the resource properly.
Hey @TimYao18, I'm running this on iPadOS, so my model files are inside the project directory, and I access them like so:
```swift
guard let path = Bundle.main.path(forResource: "SD_1.5_canny_openpose", ofType: nil, inDirectory: "SD_Models") else {
    fatalError("Fatal error: failed to find the CoreML models.")
}
let resourceURL = URL(fileURLWithPath: path)
```
Where `SD_Models` is a folder in the project root, and `SD_1.5_canny_openpose` is a folder inside it (containing the models).
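If it helps with the `reduceMemory` issue, a quick sanity check (just a sketch, not part of the package) is to list what's directly inside that resources folder and confirm the compiled `.mlmodelc` bundles (TextEncoder.mlmodelc, Unet.mlmodelc, etc.) sit at its top level:

```swift
// List the compiled model bundles the pipeline expects to find
// directly inside the resources folder.
let contents = try FileManager.default.contentsOfDirectory(
    at: resourceURL,
    includingPropertiesForKeys: nil
)
for url in contents where url.pathExtension == "mlmodelc" {
    print("Found model bundle:", url.lastPathComponent)
}
```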
You can check out the MochiDiffusion repo to see how they do it on macOS