
There appear to be 1 leaked semaphore objects to clean up at shutdown

Open oscarnevarezleal opened this issue 3 years ago • 61 comments

Can't complete the conversion of models to Core ML

Chip: Apple M2
Memory: 8GB
OS: 13.0.1 (22A400)
pip list
Package                        Version    Editable project location
------------------------------ ---------- ----------------------------------------------------------
accelerate                     0.15.0
certifi                        2022.9.24
charset-normalizer             2.1.1
coremltools                    6.1
diffusers                      0.9.0
filelock                       3.8.0
huggingface-hub                0.11.1
idna                           3.4
importlib-metadata             5.1.0
mpmath                         1.2.1
numpy                          1.23.5
packaging                      21.3
Pillow                         9.3.0
pip                            21.3.1
protobuf                       3.20.3
psutil                         5.9.4
pyparsing                      3.0.9
python-coreml-stable-diffusion 0.1.0      /Users/....
PyYAML                         6.0
regex                          2022.10.31
requests                       2.28.1
scipy                          1.9.3
setuptools                     60.2.0
sympy                          1.11.1
tokenizers                     0.13.2
torch                          1.12.0
tqdm                           4.64.1
transformers                   4.25.1
typing_extensions              4.4.0
urllib3                        1.26.13
wheel                          0.37.1
zipp                           3.11.0

python -m python_coreml_stable_diffusion.torch2coreml --convert-unet --convert-text-encoder --convert-vae-decoder --convert-safety-checker -o packages

!!! macOS 13.1 and newer or iOS/iPadOS 16.2 and newer is required for best performance !!!
INFO:__main__:Initializing StableDiffusionPipeline with CompVis/stable-diffusion-v1-4..
Fetching 16 files: 100%|█████████████████████████████████████████████████████████████████████████████████████████| 16/16 [00:00<00:00, 11636.70it/s]
INFO:__main__:Done.
INFO:__main__:Converting vae_decoder
INFO:__main__:`vae_decoder` already exists at packages/Stable_Diffusion_version_CompVis_stable-diffusion-v1-4_vae_decoder.mlpackage, skipping conversion.
INFO:__main__:Converted vae_decoder
INFO:__main__:Converting unet
INFO:__main__:Attention implementation in effect: AttentionImplementations.SPLIT_EINSUM
INFO:__main__:Sample inputs spec: {'sample': (torch.Size([2, 4, 64, 64]), torch.float32), 'timestep': (torch.Size([2]), torch.float32), 'encoder_hidden_states': (torch.Size([2, 768, 1, 77]), torch.float32)}
INFO:__main__:JIT tracing..
/Users/xxx/xxx/apple/ml-stable-diffusion/venv/lib/python3.9/site-packages/torch/nn/functional.py:2515: UserWarning: __floordiv__ is deprecated, and its behavior will change in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor').
  _verify_batch_size([input.size(0) * input.size(1) // num_groups, num_groups] + list(input.size()[2:]))
/Users/xxx/xxx/apple/ml-stable-diffusion/python_coreml_stable_diffusion/layer_norm.py:61: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  assert inputs.size(1) == self.num_channels
INFO:__main__:Done.
INFO:__main__:Converting unet to CoreML..
WARNING:coremltools:Tuple detected at graph output. This will be flattened in the converted model.
Converting PyTorch Frontend ==> MIL Ops:   0%|                                                                           | 0/7876 [00:00<?, ? ops/s]WARNING:coremltools:Saving value type of int64 into a builtin type of int32, might lose precision!
Converting PyTorch Frontend ==> MIL Ops: 100%|█████████████████████████████████████████████████████████████▉| 7874/7876 [00:01<00:00, 4105.24 ops/s]
Running MIL Common passes: 100%|███████████████████████████████████████████████████████████████████████████████| 39/39 [00:27<00:00,  1.43 passes/s]
Running MIL FP16ComputePrecision pass: 100%|█████████████████████████████████████████████████████████████████████| 1/1 [00:44<00:00, 44.50s/ passes]
Running MIL Clean up passes: 100%|█████████████████████████████████████████████████████████████████████████████| 11/11 [03:00<00:00, 16.40s/ passes]
zsh: killed     python -m python_coreml_stable_diffusion.torch2coreml --convert-unet    -o
/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/multiprocessing/resource_tracker.py:216: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '

oscarnevarezleal avatar Dec 02 '22 03:12 oscarnevarezleal

I had the same issue: https://github.com/apple/ml-stable-diffusion/issues/5

enzyme69 avatar Dec 02 '22 10:12 enzyme69

Same thing here for me, and in the end I'm missing the safety_checker Core ML model.

felipebaez avatar Dec 02 '22 11:12 felipebaez

Just updated the OS to 13.1 preview, still facing the same error.

oscarnevarezleal avatar Dec 02 '22 15:12 oscarnevarezleal

Same here.

Apple M1 Pro 16 GB RAM macOS 13.0.1 (22A400)

Edit: After some investigation it seems like my Mac ran out of memory. It worked well in a later attempt.

Screenshot 2022-12-03 at 14 42 41

martinlexow avatar Dec 03 '22 13:12 martinlexow

8 GB will cause an out-of-memory issue, as suggested by Yasuhito. Best if you can get a compiled model from someone, or try running again and again with only Terminal open after logging in.
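A quick pre-flight check can confirm whether RAM is the likely culprit. This is a stdlib-only sketch; the 16 GB threshold is a guess based on the reports in this thread (8 GB machines fail, 16 GB machines sometimes succeed), not a documented requirement.

```python
import os

def total_ram_gb() -> float:
    """Total physical memory in GiB (POSIX systems)."""
    return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES") / 2**30

ram = total_ram_gb()
print(f"Total RAM: {ram:.1f} GiB")
if ram < 16:  # rough threshold inferred from this thread, not official
    print("Conversion may get killed by the OS; close other apps first.")
```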

enzyme69 avatar Dec 05 '22 22:12 enzyme69

Same here.

Apple M1 Pro 16 GB RAM macOS 13.0.1 (22A400)

Edit: After some investigation it seems like my Mac ran out of memory. It worked well in a later attempt.

Screenshot 2022-12-03 at 14 42 41

I have the same RAM memory on my Mac. Did you keep trying until it worked eventually?

mariapatulea avatar Apr 06 '23 11:04 mariapatulea

@mariapatulea It never worked for me.

oscarnevarezleal avatar Apr 07 '23 04:04 oscarnevarezleal

I think this is an issue with tqdm and floating point refs on the progress bar.

I get the same issue and don't have coreml installed.

tqdm    4.65.0

bensh avatar Apr 26 '23 07:04 bensh

Hi there!

Has somebody found any solution to this problem? I'm facing the same issue on M1 chip.

Siriz23 avatar May 23 '23 18:05 Siriz23

I'm facing the same issue on an M1 chip. Does anyone have a solution?

tahuuha avatar May 29 '23 12:05 tahuuha

Check the solution: https://github.com/AUTOMATIC1111/stable-diffusion-webui/issues/1890

tahuuha avatar May 29 '23 12:05 tahuuha

I've got the same problem in Stable Diffusion v1.5.1 running on a MacBook M2:

anaconda3/lib/python3.10/multiprocessing/resource_tracker.py:224: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '

AlanZhou2022 avatar Aug 17 '23 14:08 AlanZhou2022

The line you quoted is just a warning and does not cause any issues. The most common reason conversions fail is running out of memory, just like in the OP's case; look for a line that says or contains "Killed".

vzsg avatar Aug 17 '23 15:08 vzsg

I am using a MacBook Pro (M2 chip) on Ventura and facing the same issue.

gamesbykk avatar Sep 17 '23 13:09 gamesbykk

Problem solved on my side by downgrading Python to 3.10.13

frankl1 avatar Oct 07 '23 10:10 frankl1

I got this error with PyTorch mps while running tqdm=4.65.0. I was able to remove it and install 4.66.1 which solved it. Not a RAM issue.
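To check your environment before converting, here is a small stdlib-only sketch. The 4.66.1 threshold comes from reports in this thread, not from a tqdm changelog I have verified, and the version parser below only handles plain `x.y.z` strings:

```python
from importlib import metadata

def needs_upgrade(version: str, minimum: str = "4.66.1") -> bool:
    """Numeric comparison of plain dotted versions, e.g. '4.65.0' < '4.66.1'."""
    def as_tuple(v: str):
        return tuple(int(part) for part in v.split("."))
    return as_tuple(version) < as_tuple(minimum)

try:
    installed = metadata.version("tqdm")
    if needs_upgrade(installed):
        print(f"tqdm {installed} found; consider: pip install -U tqdm")
    else:
        print(f"tqdm {installed} found; no upgrade needed")
except metadata.PackageNotFoundError:
    print("tqdm is not installed")
```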

zhanwenchen avatar Oct 14 '23 03:10 zhanwenchen

I think it might be RAM related even if package versions help; they may just use memory better. It consistently failed for me, and then I closed everything on my Mac that I could and it ran fine without changing versions. 🤷

YakDriver avatar Oct 20 '23 20:10 YakDriver

I got this error with PyTorch mps while running tqdm=4.65.0. I was able to remove it and install 4.66.1 which solved it. Not a RAM issue.

I agree it's not a RAM issue, I have 96GB of RAM on a custom-built M2 model and I'm getting the error. I can guarantee it has nothing to do with RAM

chris-heney avatar Oct 22 '23 02:10 chris-heney

+1 with the error. M1 Max 64GB

42piratas avatar Nov 07 '23 13:11 42piratas

Getting the same error when training Dreambooth. Did anyone figure out a solution to this?

loc("mps_add"("(mpsFileLoc): /AppleInternal/Library/BuildRoots/a0876c02-1788-11ed-b9c4-96898e02b808/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShadersGraph/mpsgraph/MetalPerformanceShadersGraph/Core/Files/MPSGraphUtilities.mm":219:0)): error: input types 'tensor<1x1280xf16>' and 'tensor<1280xf32>' are not broadcast compatible
LLVM ERROR: Failed to infer result type(s).
./webui.sh: line 255: 38149 Abort trap: 6           "${python_cmd}" -u "${LAUNCH_SCRIPT}" "$@"
/opt/homebrew/Cellar/[email protected]/3.10.13_1/Frameworks/Python.framework/Versions/3.10/lib/python3.10/multiprocessing/resource_tracker.py:224: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '

mo-foodbit avatar Nov 15 '23 06:11 mo-foodbit

It's not the same error though. Yours was:

error: input types 'tensor<1x1280xf16>' and 'tensor<1280xf32>' are not broadcast compatible
LLVM ERROR: Failed to infer result type(s).

The warning about the semaphore, just like in the OP (where the real error was zsh: killed, due to running out of memory), is just a red herring that gets printed after both successful and failed conversions.

vzsg avatar Nov 15 '23 11:11 vzsg

I have the same error on a M3 model with 36GB memory! :(

mossishahi avatar Dec 19 '23 01:12 mossishahi

Same issue on M3 with 128GB ram

LukaVerhoeven avatar Dec 28 '23 10:12 LukaVerhoeven

@LukaVerhoeven nice config^ 🙂

julien-c avatar Jan 02 '24 08:01 julien-c

@LukaVerhoeven nice config^ 🙂

Was hoping on no memory issues with this setup 😒

LukaVerhoeven avatar Jan 10 '24 07:01 LukaVerhoeven

It seems related to the device type (the Mac `mps` backend). When I move the `mps` tensor to the CPU with `.cpu()`, the problem no longer appears.

zzingae avatar Jan 17 '24 15:01 zzingae

Same error on an M3 Max 96GB while trying to run InvokeAI. Any solution?

lemonsz15 avatar Jan 20 '24 17:01 lemonsz15

I think this is an issue with tqdm and floating point refs on the progress bar.

I get the same issue and don't have coreml installed.

tqdm    4.65.0

Removing tqdm solved my issue. Thank you!

Blenderama avatar Jan 30 '24 14:01 Blenderama

In my opinion, this happens because you run it in Docker, where the shared-memory (shm) size is small. You can run df -lh to check its size. I created the container with --shm-size=2G and then it ran successfully.
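For anyone in this Docker situation, a sketch of the check and the fix (only the `--shm-size` flag comes from the comment above; the image name would be your own):

```shell
# Inside the container: check the shared-memory mount.
# Docker's default /dev/shm is often a small 64M tmpfs.
df -h /dev/shm

# On the host, recreate the container with a larger shm size, e.g.:
#   docker run --shm-size=2g <your-image> ...
```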

yunshiyu11 avatar Apr 07 '24 13:04 yunshiyu11

Same here on Apple M3 Max 36GB MacBook Pro. Never installed CoreML. Upgrading from tqdm=4.65.0 to 4.66.1 solves the problem.

chenyangkang avatar Apr 17 '24 04:04 chenyangkang