
MacOS Sonoma 14.4 update breaks GPU acceleration on Apple Silicon

Open francisjervis opened this issue 11 months ago • 54 comments

Steps to reproduce: Install ComfyUI on macOS 14.3.x, then update to the latest version, 14.4.
Expected behavior: Generation of usable images.
Issue: After the update, ComfyUI outputs Rothko-esque solid-color images (usually cyan or white), unless a ControlNet is used, in which case it outputs a ghostly image of the input on a solid color.

This has been reported to affect both ComfyUI and other SD UIs since the 14.4 beta. However, Draw Things appears to function normally.

I have tried updating pytorch to the latest nightly which did not resolve the issue.
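For anyone reproducing this, a quick environment summary is useful to attach to reports like this one. A minimal sketch (standard library only; torch is optional and the check degrades gracefully when it is not installed):

```python
# Minimal environment report for macOS/MPS bug reports.
# Nothing here is ComfyUI-specific; torch is probed but not required.
import platform

def env_report():
    info = {
        "macos": platform.mac_ver()[0],   # e.g. "14.4"
        "machine": platform.machine(),    # "arm64" on Apple Silicon
    }
    try:
        import torch
        info["torch"] = torch.__version__
        mps = getattr(torch.backends, "mps", None)
        info["mps_available"] = bool(mps and mps.is_available())
    except ImportError:
        info["torch"] = None
    return info

if __name__ == "__main__":
    print(env_report())
```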

Also of note: "generation" time seems much shorter than before the update, though the output is garbage.

francisjervis avatar Mar 08 '24 05:03 francisjervis

Webui Forge works fine and does not seem slower either. No difference whether I use Torch 2.1.2 or the latest 2.2.1. I am on the latest 14.4 and did not have issues in the beta either.

So I installed Comfy with the help of Pinokio, as I am lazy and Comfy does not have a nice Mac install script.

You are right: all it generates is solid blue pictures.

InvokeAI works fine too, so I'm not sure why Comfy does not.

l0stl0rd avatar Mar 08 '24 12:03 l0stl0rd

I just did a clean install of everything related to ComfyUI and it's still broken. I didn't imagine a macOS update would be the culprit.

SharmaTushar avatar Mar 08 '24 13:03 SharmaTushar

If you search for "sonoma 14.4 stable diffusion" you will find a number of different reports of issues with the beta. However, none of them lines up with the solid-blue output behavior on the release version.

francisjervis avatar Mar 08 '24 17:03 francisjervis

Seeing that InvokeAI works fine and it uses venv, I decided to use a venv for ComfyUI as well, with Python 3.11, installing only ComfyUI's requirements and torch. It unfortunately didn't work. Would Miniconda need to be involved in this somehow? I'm not sure what steps that would require.

SharmaTushar avatar Mar 08 '24 18:03 SharmaTushar

Would Miniconda need to be included in this somehow? I'm not sure what steps would need to be followed for that.

Honestly, I can't see how that would be relevant. If that fixed anything, I would assume a version-specific incompatibility between 14.4 and a pytorch nightly (pinned in the requirements.txt used to build the venv but not installed user-wide, or something like that). My best guess is that one of the modules relied on an undocumented optimization path that Apple broke without release notes - so the bug likely lives in a dependency like torch, but which one is anyone's guess at this point.

This issue in Automatic1111 might be relevant, but the behavior isn't the same (glitchy, not solid cyan).

francisjervis avatar Mar 08 '24 18:03 francisjervis

Honestly can't see how that would be relevant

That's all I could conjure up with my very limited Python knowledge.

This https://github.com/AUTOMATIC1111/stable-diffusion-webui/discussions/5461#discussioncomment-8592076 might be relevant, but the behavior isn't the same (glitchy, not solid cyan).

Switching to different samplers has indeed fixed a part of the issue. The initial generation is glitchy but iterative upscale improves it quite a bit.

SharmaTushar avatar Mar 08 '24 19:03 SharmaTushar

Switching to different samplers has indeed fixed a part of the issue.

Which checkpoint? Using SDXL-Lightning here which requires Euler sampler, the output with others is always garbage.

francisjervis avatar Mar 08 '24 19:03 francisjervis

So far I've only tried using SDXL 1.0 (aamXLAnimeMix_v10) with dpmpp_2m_sde_gpu. The final output still has artefacts; I've been noticing them even before the OS update, though to a lesser extent.

SharmaTushar avatar Mar 08 '24 20:03 SharmaTushar

I just tested with and without the --cpu flag, and it's immediately clear this is an mps/GPU acceleration issue. (Comparison images attached.) The workflow should be embedded for reproducibility, but I expect this to reproduce on any workflow. Inspired by having fixed the same issue with local LLM generation the same way...
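The same with/without comparison can be sketched programmatically as a CPU-vs-MPS parity check. Hedged: which op is actually at fault on 14.4 is unknown, so this only spot-checks a matrix multiply; it returns None rather than failing when torch or MPS is unavailable:

```python
# Spot-check whether an MPS (GPU) result matches its CPU counterpart.
# Returns True/False, or None when torch or the MPS backend is missing.
def mps_matches_cpu(atol=1e-3):
    try:
        import torch
    except ImportError:
        return None
    if not torch.backends.mps.is_available():
        return None
    torch.manual_seed(0)
    a = torch.randn(64, 64)
    b = torch.randn(64, 64)
    cpu = a @ b
    mps = (a.to("mps") @ b.to("mps")).cpu()
    # A large mismatch here would point at the acceleration layer,
    # consistent with --cpu producing correct images.
    return torch.allclose(cpu, mps, atol=atol)
```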

francisjervis avatar Mar 09 '24 00:03 francisjervis

I did a fresh install of ComfyUI without any modifications or custom nodes, and I can confirm that workflows which worked before the macOS upgrade have stopped working. The --cpu command-line argument helps, but it is much slower...

criyle avatar Mar 09 '24 03:03 criyle

Confirming that --cpu works on macOS Sonoma 14.4. Device: mps generates a pure-color image on macOS Sonoma 14.4.

powerAmore avatar Mar 11 '24 02:03 powerAmore

After updating Sonoma to 14.4 and doing some testing, ComfyUI seems to work only with SDE samplers. ControlNet nodes are also not working.

pphoto808 avatar Mar 14 '24 06:03 pphoto808

After updating Sonoma to 14.4 and doing some testing, ComfyUI seems to work only with SDE samplers. ControlNet nodes are also not working.

Try with the --cpu flag and make sure --force-fp16 is not set; that fixed ControlNet for me.
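For reference, the two flags interact like this when launching. A hypothetical wrapper (main.py is ComfyUI's entry point; the flag names are the ones discussed in this thread):

```python
# Build a ComfyUI launch command for the CPU workaround.
# --cpu and --force-fp16 are the flags discussed in this thread;
# --force-fp16 is reported to re-break ControlNet, so it defaults off.
def launch_args(use_cpu=True, force_fp16=False):
    args = ["python", "main.py"]
    if use_cpu:
        args.append("--cpu")
    if force_fp16:
        args.append("--force-fp16")
    return args
```

Something like `subprocess.run(launch_args())` from the ComfyUI checkout would then start it on the CPU.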

francisjervis avatar Mar 14 '24 06:03 francisjervis

Confirming that some SDE samplers do work. dpmpp_sde and dpmpp_sde_gpu output clean results. dpmpp_2m_sde, dpmpp_2m_sde_gpu, dpmpp_3m_sde, and dpmpp_3m_sde_gpu will output, but it's not pretty.

untitledfile01 avatar Mar 14 '24 14:03 untitledfile01

From a very handmade, casual test, it seems to me that it's not a matter of a single sampler or a single scheduler but of the combination of both: some combinations work, others don't. My results (M1 8GB):

Work (as expected):

  • dpmpp_sde / sgm_uniform
  • dpmpp_sde / normal
  • euler / ddim_uniform
  • dpm_sde_gpu / sgm_uniform
  • dpm_sde_gpu / karras
  • dpmpp_2m_sde / karras

Don't work (blurry and weird):

  • dpmpp_sde / karras
  • dpmpp_2m / karras
  • euler, euler_a / normal
  • euler, euler_a / sgm_uniform
  • euler / karras
  • euler / simple
  • ddim / normal
  • ddpm / normal
  • ddpm / karras
  • heun / normal
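The results above can be captured as a small lookup table. This is just a sketch transcribing the list as reported (M1 8GB), not an official compatibility matrix; combinations not listed are untested:

```python
# Sampler/scheduler combinations from the M1 8GB report above.
# True = works as expected, False = blurry/weird output.
MPS_COMBO_OK = {
    ("dpmpp_sde", "sgm_uniform"): True,
    ("dpmpp_sde", "normal"): True,
    ("euler", "ddim_uniform"): True,
    ("dpm_sde_gpu", "sgm_uniform"): True,
    ("dpm_sde_gpu", "karras"): True,
    ("dpmpp_2m_sde", "karras"): True,
    ("dpmpp_sde", "karras"): False,
    ("dpmpp_2m", "karras"): False,
    ("euler", "normal"): False,
    ("euler_a", "normal"): False,
    ("euler", "sgm_uniform"): False,
    ("euler_a", "sgm_uniform"): False,
    ("euler", "karras"): False,
    ("euler", "simple"): False,
    ("ddim", "normal"): False,
    ("ddpm", "normal"): False,
    ("ddpm", "karras"): False,
    ("heun", "normal"): False,
}

def combo_works(sampler, scheduler):
    """True/False per the report above; None means untested."""
    return MPS_COMBO_OK.get((sampler, scheduler))
```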

cmgaston avatar Mar 15 '24 10:03 cmgaston

From a very handmade and casual test, it seems to me that it's not a matter of a single sampler or a single scheduler but the combination of both: some combinations work, other don't. [...]

+1

powerAmore avatar Mar 16 '24 06:03 powerAmore

I temporarily downgraded torchvision to version 0.16.2 to resolve the issue. This is the command for my conda env; I am using Python 3.10.

conda install torchvision=0.16.2 -c pytorch

Host:

MacOS M1 Sonoma 14.4
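To check whether an install is on an affected build, here is a small sketch. The assumption that the 0.17.x series is the broken one is mine, inferred only from 0.16.2 being the working downgrade target in this thread:

```python
# Report the installed torchvision version and flag the series this
# thread's downgrade works around (assumption: 0.17.x is affected).
from importlib.metadata import PackageNotFoundError, version

def torchvision_status():
    try:
        v = version("torchvision")
    except PackageNotFoundError:
        return ("not installed", False)
    major, minor = (int(x) for x in v.split(".")[:2])
    suspect = (major, minor) == (0, 17)
    return (v, suspect)
```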

CharosTao avatar Mar 17 '24 17:03 CharosTao

I temporarily downgraded torchvision to version 0.16.2 to resolve the issue. Below is my conda env environment.yml, and I am using Python 3.10.

This works for me too and solves the blurry images with SD 1.5 models (thanks!). Half of the problem is now gone.

I still have insanely high generation times (over 100 s/it) with SDXL models, though, as if - in this case too - there's no GPU acceleration (speed is the same as with the --cpu flag, more or less 10-15 times slower than usual). Unfortunately the torchvision downgrade doesn't change this, so something else must also be involved.

FWIW, with SD 1.5 models the output was the issue, not speed - at least not in my case.

Edit: of course my issue with SDXL models might be related to my 8GB of RAM, but it didn't happen on macOS 14.3.x. If that's the cause I can live with it, since it makes sense, so I wonder whether anyone with a beefier Mac (16GB or more) is experiencing the same.

Edit 2: Forget the SDXL part - I dared to upgrade my main Mac (M2 Pro 16GB) and speed is fine. I guess it's just a matter of memory after all.

cmgaston avatar Mar 17 '24 19:03 cmgaston

I temporarily downgraded torchvision to version 0.16.2 to resolve the issue. Below is my conda env environment.yml, and I am using Python 3.10.

This works for me too and solves the blurry images with SD 1.5 models (thanks!). Half of the problem is now gone.

Might be useful to add which sampler/scheduler pairs this "fix" worked for, looking at @cmgaston's post.

francisjervis avatar Mar 17 '24 19:03 francisjervis

Might be useful to add which sampler/scheduler pairs this "fix" worked for, looking at @cmgaston's post.

In my case, it works for all the combos I tried

cmgaston avatar Mar 18 '24 06:03 cmgaston

is there any way to get it?

chikovani97 avatar Mar 18 '24 16:03 chikovani97

I'm using an M1 Ultra 64GB, and dpmpp_sde_gpu/normal works fine with SDXL and other models. No change in speed or anything.

untitledfile01 avatar Mar 18 '24 17:03 untitledfile01

It's already giving me dark solid-color results, so I think it's already affected.

chikovani97 avatar Mar 18 '24 17:03 chikovani97

I'm using an M1 Ultra 64GB, and dpmpp_sde_gpu/normal works fine with SDXL and other models. No change in speed or anything.

So you just upgraded? No downgrade of torchvision?

QueryType avatar Mar 19 '24 13:03 QueryType

M1 MacBook Pro, macOS 14.4, with torchvision==0.16.2: most checkpoints work, with a few exceptions. (Image attached.) This took 100 seconds, which seems normal for SDXL.

bluevisor avatar Mar 20 '24 03:03 bluevisor

M1 MacBook Pro, macOS 14.4, with torchvision==0.16.2: most checkpoints work, with a few exceptions. (Image attached.) This took 100 seconds, which seems normal for SDXL.

So we need to wait for a torchvision fix?

QueryType avatar Mar 25 '24 07:03 QueryType

Does the LCM sampler work? Mine doesn't (on an M2 Max 64GB); torchvision==0.16.2 doesn't help.

trivita avatar Mar 25 '24 09:03 trivita

Does the LCM sampler work? Mine doesn't (on an M2 Max 64GB); torchvision==0.16.2 doesn't help.

It works for me, with all the available schedulers (tried on an M2 Pro 16GB, torchvision==0.16.2, default workflow)

cmgaston avatar Mar 25 '24 10:03 cmgaston

Does the LCM sampler work? Mine doesn't (on an M2 Max 64GB); torchvision==0.16.2 doesn't help.

It works for me, with all the available schedulers (tried on an M2 Pro 16GB, torchvision==0.16.2, default workflow)

sorry, my mistake, LCM still works.

trivita avatar Mar 25 '24 10:03 trivita

I upgraded to macOS 14.4.1 and it is still not working properly.

criyle avatar Mar 25 '24 23:03 criyle