InvokeAI
[bug] MacOs: diffusers model, Image 768x768, failed assertion NDArray > 2**32
Is there an existing issue for this?
- [X] I have searched the existing issues
OS
macOS
GPU
mps
VRAM
128GB
What happened?
@keturn, when I use a diffusers model (deliberate), I get a failed assertion when the image is 768x768.
NOTE: Before the update to diffusers 0.12.1 and transformers 4.26.0, I was able to generate images of this size and larger.
I also tried running !convert_model again, but it fails too.
My local repo is updated to c18db4e47b10cf1658612f3eec2d537a789b10ea and my .venv was updated with python -mpip install -r requirements.txt
Screenshots
(midjourney) invoke> !switch d-deliberate
Current VRAM usage: 0.00G
Offloading midjourney to CPU
Loading diffusers model from /users/ivano/Junk/SD/diffusers/deliberate-v1.1
| Using more accurate float32 precision
| Default image dimensions = 512 x 512
Model loaded in 1.07s
Textual inversions available: Style-GlassFinal, Style-Princess
Setting Sampler to k_lms (LMSDiscreteScheduler)
(d-deliberate) invoke> a nice dog in the garden -H 768 -W 768
objc[26679]: Class CaptureDelegate is implemented in both /Users/ivano/Code/Ai/invokeai/.venv/lib/python3.10/site-packages/cv2/cv2.abi3.so (0x13bc76480) and /opt/homebrew/Cellar/opencv/4.7.0_1/lib/libopencv_videoio.4.7.0.dylib (0x369c78880). One of the two will be used. Which one is undefined.
objc[26679]: Class CVWindow is implemented in both /Users/ivano/Code/Ai/invokeai/.venv/lib/python3.10/site-packages/cv2/cv2.abi3.so (0x13bc764d0) and /opt/homebrew/Cellar/opencv/4.7.0_1/lib/libopencv_highgui.4.7.0.dylib (0x31f6e0b10). One of the two will be used. Which one is undefined.
objc[26679]: Class CVView is implemented in both /Users/ivano/Code/Ai/invokeai/.venv/lib/python3.10/site-packages/cv2/cv2.abi3.so (0x13bc764f8) and /opt/homebrew/Cellar/opencv/4.7.0_1/lib/libopencv_highgui.4.7.0.dylib (0x31f6e0b38). One of the two will be used. Which one is undefined.
objc[26679]: Class CVSlider is implemented in both /Users/ivano/Code/Ai/invokeai/.venv/lib/python3.10/site-packages/cv2/cv2.abi3.so (0x13bc76520) and /opt/homebrew/Cellar/opencv/4.7.0_1/lib/libopencv_highgui.4.7.0.dylib (0x31f6e0b60). One of the two will be used. Which one is undefined.
Patchmatch initialized
Generating: 0%| | 0/1 [00:00<?, ?it/s]/Users/ivano/Code/Ai/invokeai/.venv/lib/python3.10/site-packages/diffusers/schedulers/scheduling_lms_discrete.py:268: UserWarning: The operator 'aten::nonzero' is not currently supported on the MPS backend and will fall back to run on the CPU. This may have performance implications. (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/mps/MPSFallback.mm:11.)
step_indices = [(schedule_timesteps == t).nonzero().item() for t in timesteps]
/AppleInternal/Library/BuildRoots/c651a45f-806e-11ed-a221-7ef33c48bc85/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShaders/MPSCore/Types/MPSNDArray.mm:724: failed assertion `[MPSNDArray initWithDevice:descriptor:] Error: total bytes of NDArray > 2**32' | 0/50 [00:00<?, ?it/s]
zsh: abort python ./scripts/invoke.py
/opt/homebrew/Cellar/python@3.10/3.10.9/Frameworks/Python.framework/Versions/3.10/lib/python3.10/multiprocessing/resource_tracker.py:224: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
warnings.warn('resource_tracker: There appear to be %d '
Additional context
This only happens with a diffusers model; when I use a ckpt model, I can generate images up to 960x960.
Contact Details
No response
I think this is the same error that usually happens on square images when the dimensions are a power of 2. 768 is not a power of 2, though. It's 3 * 2^8.
@whosawhatsis
> I think this is the same error that usually happens on square images when the dimensions are a power of 2. 768 is not a power of 2, though. It's 3 * 2^8.
It was working before I updated my local repo to the latest commit and ran pip install. It is also not related to square images: 768x704 fails too, as do 768x832, 768x960, and 640x832.
All of those dimensions worked with diffusers 0.11; see the next comment.
I reverted the .venv to diffusers==0.11 and transformers==4.25 on the latest commit, and 768x768 and 960x960 work!
I believe there is a regression in the latest version of diffusers/transformers on macOS.
diff --git a/environments-and-requirements/requirements-base.txt b/environments-and-requirements/requirements-base.txt
index b7a3a2a7..0c791e1d 100644
--- a/environments-and-requirements/requirements-base.txt
+++ b/environments-and-requirements/requirements-base.txt
@@ -2,7 +2,7 @@
accelerate
albumentations
datasets
-diffusers[torch]~=0.12
+diffusers[torch]==0.11
dnspython==2.2.1
einops
eventlet
@@ -37,7 +37,7 @@ taming-transformers-rom1504
test-tube>=0.7.5
torch-fidelity
torchmetrics
-transformers~=4.26
+transformers==4.25
windows-curses; sys_platform == 'win32'
https://github.com/Birch-san/k-diffusion/archive/refs/heads/mps.zip#egg=k-diffusion
https://github.com/invoke-ai/PyPatchMatch/archive/refs/tags/0.1.5.zip#egg=pypatchmatch
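To put the `2**32` in the assertion message into perspective, here is a back-of-the-envelope sketch of how large a single float32 buffer can get at these image sizes. It assumes, without confirmation from anything in this thread, that the largest buffer is a self-attention score tensor of shape `(batch * heads, N, N)`, with 8 heads, a batch of 2, and `N` equal to the number of latent positions. The function names and defaults are illustrative only.

```python
# Rough size of one float32 attention-score tensor for a Stable Diffusion 1.x UNet.
# Every shape assumption here is illustrative and not taken from the thread.

MPS_NDARRAY_LIMIT = 2**32  # bytes, as quoted in the assertion message


def attention_score_bytes(width, height, heads=8, batch=2, bytes_per_element=4):
    """Bytes in a (batch * heads, N, N) tensor, N being the number of latent positions."""
    tokens = (width // 8) * (height // 8)  # latents are 1/8 of the image size per side
    return batch * heads * tokens * tokens * bytes_per_element


def exceeds_limit(width, height, **kwargs):
    """True if the estimated tensor is larger than 2**32 bytes."""
    return attention_score_bytes(width, height, **kwargs) > MPS_NDARRAY_LIMIT


for w, h in [(512, 512), (768, 704), (768, 768), (960, 960)]:
    size = attention_score_bytes(w, h)
    print(f"{w}x{h}: {size / 2**30:.2f} GiB, over 2**32 bytes: {size > MPS_NDARRAY_LIMIT}")
```

This is arithmetic only. Reports elsewhere in this thread show the same image size failing in one setup and working in another, so the estimate does not by itself explain which sizes crash.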
@keturn This continues to be a problem on macOS MPS systems. Is this a known issue with diffusers 0.12.1?
Not as far as I know. I don't find any reports of "total bytes of NDArray" in the upstream bug tracker.
the AttnProcessor stuff was only added in 0.12, so downgrading isn't an option without breaking .swap()
@keturn i found this, not sure if it's relevant though https://github.com/AUTOMATIC1111/stable-diffusion-webui/pull/5796
I am copying this comment from the InvokeAI Version 2.3.0 discussion here in order to preserve relevant information about this issue. It is not meant to ping anyone (reference https://github.com/invoke-ai/InvokeAI/discussions/2482#discussioncomment-4887572)
Here is a log that shows what I'm observing.
I have a local branch where I have downgraded transformers and diffusers, as you can see in the git show output. Here is the pip list of my environment: venv-pip-list-patched.txt
>>> git show
commit 4379b444104620cf0bca212ae5e518d98df0a9ea (HEAD -> my-fixes, tag: good)
Author: Ivano Coltellacci <[email protected]>
Date: Sat Jan 28 12:39:43 2023 +0100
fix: rollback to previous diffuseur version
diff --git a/environments-and-requirements/requirements-base.txt b/environments-and-requirements/requirements-base.txt
index b7a3a2a7..0c791e1d 100644
--- a/environments-and-requirements/requirements-base.txt
+++ b/environments-and-requirements/requirements-base.txt
@@ -2,7 +2,7 @@
accelerate
albumentations
datasets
-diffusers[torch]~=0.12
+diffusers[torch]==0.11
dnspython==2.2.1
einops
eventlet
@@ -37,7 +37,7 @@ taming-transformers-rom1504
test-tube>=0.7.5
torch-fidelity
torchmetrics
-transformers~=4.26
+transformers==4.25
windows-curses; sys_platform == 'win32'
https://github.com/Birch-san/k-diffusion/archive/refs/heads/mps.zip#egg=k-diffusion
https://github.com/invoke-ai/PyPatchMatch/archive/refs/tags/0.1.5.zip#egg=pypatchmatch
In the following log I optimize the analog.ckpt model into a diffusers model, then generate a 960x960 image.
>>> python ./scripts/invoke.py
* Initializing, be patient...
>> Initialization file /Users/ivano/Code/Ai/invokeai.stable/invokeai.init found. Loading...
>> Internet connectivity is True
>> InvokeAI, version 2.3.0+a0
>> InvokeAI runtime directory is "/Users/ivano/Code/Ai/invokeai.stable"
>> GFPGAN Initialized
>> CodeFormer Initialized
>> ESRGAN Initialized
>> Using device_type mps
>> Current VRAM usage: 0.00G
>> Loading midjourney from /Users/ivano/Junk/SD/midjourney-v4.ckpt
>> Scanning Model: midjourney
>> Model scanned ok!
>> Loading midjourney from /Users/ivano/Junk/SD/midjourney-v4.ckpt
| Forcing garbage collection prior to loading new model
| LatentDiffusion: Running in eps-prediction mode
| DiffusionWrapper has 859.52 M params.
| Making attention of type 'vanilla' with 512 in_channels
| Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
| Making attention of type 'vanilla' with 512 in_channels
| Using more accurate float32 precision
| Loading VAE weights from: /Users/ivano/Code/Ai/invokeai.stable/models/ldm/stable-diffusion-v1/vae-ft-mse-840000-ema-pruned.ckpt
>> Model loaded in 4.94s
>> Model loaded in 5.03s
>> Textual inversions available: Style-Princess
>> Setting Sampler to k_lms
* Initialization done! Awaiting your command (-h for help, 'q' to quit)
(midjourney) invoke> !models
analog not loaded ckpt Analog Diffusion v1 [photo]
clarity not loaded ckpt Clarity [photo]
kalista not loaded ckpt Kalista [general]
midjourney active ckpt Midjourney v4 [general]
sd-15 not loaded ckpt Stable Diffusion version 1.5
sd-inpaint-15 not loaded ckpt Stable Diffusion version 1.5 (inpainting)
vintedois not loaded ckpt Vintedois v0.1 (estilovintedois) [general]
(midjourney) invoke>
(midjourney) invoke> !optimize analog
>> Optimizing analog (30-60s)
global_step key not found in model
>> Success. Optimized model is now located at /Users/ivano/Code/Ai/invokeai.stable/models/converted-ckpts/analog
>> Writing new config file entry for analog
>> vae-ft-mse-840000-ema-pruned VAE corresponds to known stabilityai/sd-vae-ft-mse diffusers version
>> Conversion succeeded
Load optimized model analog? [y]
>> Current VRAM usage: 0.00G
>> Offloading midjourney to CPU
>> Loading diffusers model from /Users/ivano/Code/Ai/invokeai.stable/models/converted-ckpts/analog
| Using more accurate float32 precision
| Loading diffusers VAE from stabilityai/sd-vae-ft-mse
| Using more accurate float32 precision
Downloading (…)_model.safetensors";: 100%|██████████| 335M/335M [00:10<00:00, 30.7MB/s]
Downloading (…)lve/main/config.json: 100%|██████████| 547/547 [00:00<00:00, 458kB/s]
| Calculating sha256 hash of model files
| sha256 = 7057230c8aaf0b50183cd43d7260f696f8c8b0524d2cff3d02cf24004fb57080 (15 files hashed in 7.54s)
| Default image dimensions = 512 x 512
>> Model loaded in 21.53s
>> Textual inversions available: Style-Princess
>> Setting Sampler to k_lms (LMSDiscreteScheduler)
Delete the original .ckpt file at (/Users/ivano/Junk/SD/analog-v1.safetensors ? [n]
(analog) invoke> an happy dog in a garden -H 960 -W 960
objc[7184]: Class CaptureDelegate is implemented in both /Users/ivano/Code/Ai/invokeai.stable/.venv/lib/python3.10/site-packages/cv2/cv2.abi3.so (0x15527a4d0) and /opt/homebrew/Cellar/opencv/4.7.0_1/lib/libopencv_videoio.4.7.0.dylib (0x30b6bc880). One of the two will be used. Which one is undefined.
objc[7184]: Class CVWindow is implemented in both /Users/ivano/Code/Ai/invokeai.stable/.venv/lib/python3.10/site-packages/cv2/cv2.abi3.so (0x15527a520) and /opt/homebrew/Cellar/opencv/4.7.0_1/lib/libopencv_highgui.4.7.0.dylib (0x2efe4cb10). One of the two will be used. Which one is undefined.
objc[7184]: Class CVView is implemented in both /Users/ivano/Code/Ai/invokeai.stable/.venv/lib/python3.10/site-packages/cv2/cv2.abi3.so (0x15527a548) and /opt/homebrew/Cellar/opencv/4.7.0_1/lib/libopencv_highgui.4.7.0.dylib (0x2efe4cb38). One of the two will be used. Which one is undefined.
objc[7184]: Class CVSlider is implemented in both /Users/ivano/Code/Ai/invokeai.stable/.venv/lib/python3.10/site-packages/cv2/cv2.abi3.so (0x15527a570) and /opt/homebrew/Cellar/opencv/4.7.0_1/lib/libopencv_highgui.4.7.0.dylib (0x2efe4cb60). One of the two will be used. Which one is undefined.
>> Patchmatch initialized
Generating: 0%| | 0/1 [00:00<?, ?it/s]/Users/ivano/Code/Ai/invokeai.stable/.venv/lib/python3.10/site-packages/diffusers/schedulers/scheduling_lms_discrete.py:268: UserWarning: The operator 'aten::nonzero' is not currently supported on the MPS backend and will fall back to run on the CPU. This may have performance implications. (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/mps/MPSFallback.mm:11.)
step_indices = [(schedule_timesteps == t).nonzero().item() for t in timesteps]
100%|██████████| 50/50 [02:12<00:00, 2.66s/it]
Generating: 100%|██████████| 1/1 [02:13<00:00, 134.00s/it]
>> Usage stats:
>> 1 image(s) generated in 134.24s
Outputs:
[68] /Users/ivano/Code/Ai/@Stuffs/images/@invokeai/001244.1087789743.png: an happy dog in a garden -s 50 -S 1087789743 -W 960 -H 960 -C 7.5 -A k_lms
(analog) invoke>
I guess this demonstrates that diffusers 0.11 worked with sizes greater than 768x768.
NOTE: SD 2.1 breaks at 832x832.
I have updated my venv to diffusers 0.13, and the issue is confirmed with this new version as well.
I have updated my venv to diffusers 0.13.1, and the issue is confirmed with this new version as well.
The issue is still being experienced with the release 2.3.1
I note you're using deliberate and midjourney. Do you get the same problem with the base SD 1.5 or 2.1 diffusers models?
Yes. In the log below I switched to the stock sd-15 diffusers model and requested an 832x704 (non-square) image.
Please note that with diffusers 0.11 I was able to create images up to 960x960 pixels. IMHO something was broken in diffusers 0.12.
* Initializing, be patient...
>> Initialization file /Users/ivano/Code/Ai/@Stuffs/invokeai.models/invokeai.init found. Loading...
>> Internet connectivity is True
>> InvokeAI, version 2.3.1
>> InvokeAI runtime directory is "/Users/ivano/Code/Ai/@Stuffs/invokeai.models"
>> GFPGAN Initialized
>> CodeFormer Initialized
>> ESRGAN Initialized
>> Using device_type mps
>> xformers not installed
>> NSFW checker is disabled
>> Current VRAM usage: 0.00G
>> Loading midjourney from /Users/ivano/junk/sd/midjourney-v4.ckpt
>> Scanning Model: midjourney
>> Model scanned ok
>> Loading midjourney from /Users/ivano/junk/sd/midjourney-v4.ckpt
| Forcing garbage collection prior to loading new model
| LatentDiffusion: Running in eps-prediction mode
| DiffusionWrapper has 859.52 M params.
| Making attention of type 'vanilla' with 512 in_channels
| Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
| Making attention of type 'vanilla' with 512 in_channels
| Using more accurate float32 precision
| Loading VAE weights from: /Users/ivano/Code/Ai/@Stuffs/invokeai.models/models/ldm/stable-diffusion-v1/vae-ft-mse-840000-ema-pruned.ckpt
>> Model loaded in 4.99s
>> Loading embeddings from /Users/ivano/Code/Ai/@Stuffs/invokeai.models/embeddings
>> Textual inversion triggers:
>> Setting Sampler to k_lms
* --web was specified, starting web server...
* Initializing, be patient...
>> Initialization file /Users/ivano/Code/Ai/@Stuffs/invokeai.models/invokeai.init found. Loading...
>> Started Invoke AI Web Server!
>> Default host address now 127.0.0.1 (localhost). Use --host 0.0.0.0 to bind any address.
>> Point your browser at http://127.0.0.1:9090
>> System config requested
objc[1693]: Class CaptureDelegate is implemented in both /Users/ivano/Code/Ai/@Stuffs/invokeai.models/.venv/lib/python3.10/site-packages/cv2/cv2.abi3.so (0x1795824e8) and /opt/homebrew/Cellar/opencv/4.7.0_1/lib/libopencv_videoio.4.7.0.dylib (0x2bd868880). One of the two will be used. Which one is undefined.
objc[1693]: Class CVWindow is implemented in both /Users/ivano/Code/Ai/@Stuffs/invokeai.models/.venv/lib/python3.10/site-packages/cv2/cv2.abi3.so (0x179582538) and /opt/homebrew/Cellar/opencv/4.7.0_1/lib/libopencv_highgui.4.7.0.dylib (0x2bd3f0b10). One of the two will be used. Which one is undefined.
objc[1693]: Class CVView is implemented in both /Users/ivano/Code/Ai/@Stuffs/invokeai.models/.venv/lib/python3.10/site-packages/cv2/cv2.abi3.so (0x179582560) and /opt/homebrew/Cellar/opencv/4.7.0_1/lib/libopencv_highgui.4.7.0.dylib (0x2bd3f0b38). One of the two will be used. Which one is undefined.
objc[1693]: Class CVSlider is implemented in both /Users/ivano/Code/Ai/@Stuffs/invokeai.models/.venv/lib/python3.10/site-packages/cv2/cv2.abi3.so (0x179582588) and /opt/homebrew/Cellar/opencv/4.7.0_1/lib/libopencv_highgui.4.7.0.dylib (0x2bd3f0b60). One of the two will be used. Which one is undefined.
>> Patchmatch initialized
>> Model change requested: sd-15
>> Current VRAM usage: 0.00G
>> Offloading midjourney to CPU
>> Loading diffusers model from runwayml/stable-diffusion-v1-5
| Using more accurate float32 precision
| Loading diffusers VAE from stabilityai/sd-vae-ft-mse
| Using more accurate float32 precision
Fetching 15 files: 100%|██████████| 15/15 [00:00<00:00, 76818.75it/s]
| Default image dimensions = 512 x 512
>> Model loaded in 8.70s
>> Loading embeddings from /Users/ivano/Code/Ai/@Stuffs/invokeai.models/embeddings
>> Textual inversion triggers:
>> Setting Sampler to k_lms (LMSDiscreteScheduler)
>> Image Generation Parameters:
{'prompt': 'an happy dog into a nice garden [((blurry)), duplicate, deformed, cartoon, animated, render\n]', 'iterations': 1, 'steps': 30, 'cfg_scale': 7.5, 'threshold': 0, 'perlin': 0, 'height': 832, 'width': 704, 'sampler_name': 'k_euler_a', 'seed': 715738625, 'progress_images': False, 'progress_latents': True, 'save_intermediates': 5, 'generation_mode': 'txt2img', 'init_mask': '...', 'hires_fix': False, 'seamless': False, 'variation_amount': 0}
>> ESRGAN Parameters: False
>> Facetool Parameters: False
>> Setting Sampler to k_euler_a (EulerAncestralDiscreteScheduler)
Generating: 0%| | 0/1 [00:00<?, ?it/s]/Users/ivano/Code/Ai/@Stuffs/invokeai.models/.venv/lib/python3.10/site-packages/diffusers/schedulers/scheduling_euler_ancestral_discrete.py:299: UserWarning: The operator 'aten::nonzero' is not currently supported on the MPS backend and will fall back to run on the CPU. This may have performance implications. (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/mps/MPSFallback.mm:11.)
step_indices = [(schedule_timesteps == t).nonzero().item() for t in timesteps]
/AppleInternal/Library/BuildRoots/c651a45f-806e-11ed-a221-7ef33c48bc85/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShaders/MPSCore/Types/MPSNDArray.mm:724: failed assertion `[MPSNDArray initWithDevice:descriptor:] Error: total bytes of NDArray > 2**32' | 0/30 [00:00<?, ?it/s]
zsh: abort invokeai --web
/opt/homebrew/Cellar/python@3.10/3.10.10_1/Frameworks/Python.framework/Versions/3.10/lib/python3.10/multiprocessing/resource_tracker.py:224: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
warnings.warn('resource_tracker: There appear to be %d '
I'm also trying the in-development InvokeAI 3.0.0, and the issue is confirmed there too (commit 86e2cb04).
FYI, I ran
pip install --upgrade --upgrade-strategy eager --use-pep517 -e .
so diffusers is now updated to version 0.14.0.
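When reporting results across upgrades like this, it can help to record the installed versions from inside the same venv. The following is a minimal sketch using only the standard library; the function name and the default package list are illustrative choices, not part of InvokeAI.

```python
from importlib import metadata


def installed_versions(packages=("torch", "diffusers", "transformers")):
    """Map each package name to its installed version, or None if it is not installed."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions().items():
    print(f"{name:<14}{version or 'not installed'}")
```

The output can be pasted next to a test result so that each OK or FAIL is tied to an exact set of versions.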
Just to understand: this happens only on macOS, right?
Does anybody have a reproducible code snippet that includes just diffusers, by any chance? This would help a lot to quickly figure out what's going on, I think :-)
Also cc @pcuenca
Yes, I'm a Mac user. Keep in mind that things were working with diffusers 0.11, as explained in the comment https://github.com/invoke-ai/InvokeAI/issues/2444#issuecomment-1423869116
I guess you can reproduce it using the commit from 28th January.
Hey @i3oc9i,
I sadly won't have the time to dive deeper into InvokeAI here. Could you maybe try to reproduce your issue using just diffusers code? E.g.:
from diffusers import StableDiffusionPipeline
pipe = ....
...
@patrickvonplaten, programming in Python is quite far from my area of knowledge, but I will give it a try later.
Anyway, IMHO this is not essential: if you look at the comments in this issue, back then (28th January) I demonstrated that with 0.11 instead of 0.12.1, on the same InvokeAI code base, the issue did not occur.
So my assumption is that there is some kind of code regression between 0.11 and 0.12.1 on macOS machines.
@i3oc9i The repro snippet would allow us to easily test on 0.11 and 0.12.1. I tried to reproduce but couldn't; what I did was run a stable diffusion pipeline and generate a single image at 768x768 - it worked. So there must be something else in your configuration that is triggering the problem. It could be additional images being generated in a batch, an image to image task, a different base model, or something else.
In any case, the NDArray issue is hopefully being resolved in the upcoming PyTorch 2.0 release. Would it be possible for you to test using the nightly (preview) version of PyTorch: https://pytorch.org/get-started/locally/? Thanks a lot!
For reference, this was my test script; it worked on the latest main:
from diffusers import StableDiffusionPipeline, DPMSolverMultistepScheduler
import torch

model_id = "lambdalabs/dreambooth-avatar"
device = "mps"
seed = 1337

# Load the scheduler configuration that ships with the model
scheduler = DPMSolverMultistepScheduler.from_pretrained(model_id, subfolder="scheduler")
torch.manual_seed(seed)

sdm = StableDiffusionPipeline.from_pretrained(
    model_id,
    scheduler=scheduler,
    safety_checker=None,
)
sdm = sdm.to(device)

# Generate a single 768x768 image on the MPS device and save it
prompt = "Yoda, avatarart style"
images = sdm(prompt, width=768, height=768, num_inference_steps=20).images
for i, image in enumerate(images):
    image.save(f"yoda_768_{device}_{seed}_{i}.png")
(This was tested on both Ventura 13.2 and 13.3 beta)
@pcuenca thank you very much for your help with the snippet. I will give it a try this evening when I come back from work.
@pcuenca @patrickvonplaten
I executed your snippet in a dedicated Python 3.10.10 venv (pip-list.txt), using diffusers 0.11 and 0.14 with transformers 4.25 and 4.26.1 respectively, based on torch 1.13.1, and I get the same result in both cases:
- 768x768 = OK
- 768x832 = OK
- 768x896 = FAIL
- 832x832 = FAIL
>>> So the difference between the two versions cannot be reproduced with this snippet.
I then upgraded my Python 3.10.10 venv (pip-list-thorch-2.1.0.txt) to torch 2.1.0.dev20230307, keeping diffusers==0.14 and transformers==4.26.1:
pip3 install --pre torch --force-reinstall --index-url https://download.pytorch.org/whl/nightly/cpu
but I get the same result:
- 768x768 = OK
- 768x832 = OK
- 768x896 = FAIL
- 832x832 = FAIL
>>> So the latest PyTorch 2.1 nightly does not solve the NDArray issue, at least in this case.
(Using a Mac Studio Ultra with 128 GB RAM and macOS Ventura 13.2.1)
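Because the failed assertion aborts the whole Python process, a size sweep like the one above is easier to repeat if each size runs in its own interpreter. The following is a minimal sketch under stated assumptions: the child script, its prompt, the model id, and the step count are hypothetical choices modeled on the test script quoted earlier, and `run_case` and `sweep` are illustrative names.

```python
import subprocess
import sys

# Hypothetical child script: prompt, model id, and step count are illustrative choices.
CHILD_SOURCE = """
import sys
from diffusers import StableDiffusionPipeline

width, height = int(sys.argv[1]), int(sys.argv[2])
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", safety_checker=None
).to("mps")
pipe("a nice dog in the garden", width=width, height=height, num_inference_steps=2)
"""


def run_case(width, height, child_source=CHILD_SOURCE):
    """Run one size in a fresh interpreter and return its exit code (non-zero on abort)."""
    result = subprocess.run(
        [sys.executable, "-c", child_source, str(width), str(height)],
        capture_output=True,
        text=True,
    )
    return result.returncode


def sweep(sizes, child_source=CHILD_SOURCE):
    """Return {'WxH': 'OK' or 'FAIL'} for every (width, height) pair."""
    return {
        f"{w}x{h}": "OK" if run_case(w, h, child_source) == 0 else "FAIL"
        for w, h in sizes
    }


# Example call (downloads the model and needs an MPS device, so it is left commented out):
# print(sweep([(768, 768), (768, 832), (768, 896), (832, 832)]))
```

Running every case in a subprocess means one aborted size does not prevent the remaining sizes from being tested, at the cost of reloading the model each time.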
@pcuenca @patrickvonplaten
Anyway, I also recreated an InvokeAI environment (pip-list.txt) using commit 01866305140aaba07ecfd590477f2ae368e7a91f, and by downgrading diffusers and transformers I was able to obtain a 960x960 image:
diff --git a/environments-and-requirements/requirements-base.txt b/environments-and-requirements/requirements-base.txt
index b7a3a2a7..0c791e1d 100644
--- a/environments-and-requirements/requirements-base.txt
+++ b/environments-and-requirements/requirements-base.txt
@@ -2,7 +2,7 @@
accelerate
albumentations
datasets
-diffusers[torch]~=0.12
+diffusers[torch]==0.11
dnspython==2.2.1
einops
eventlet
@@ -37,7 +37,7 @@ taming-transformers-rom1504
test-tube>=0.7.5
torch-fidelity
torchmetrics
-transformers~=4.26
+transformers==4.25
windows-curses; sys_platform == 'win32'
https://github.com/Birch-san/k-diffusion/archive/refs/heads/mps.zip#egg=k-diffusion
https://github.com/invoke-ai/PyPatchMatch/archive/refs/tags/0.1.5.zip#egg=pypatchmatch
I'm available to do any other tests you think useful to solve this issue.
Regarding the latest release (2.3.2) and the mentioned bug fix:
Upgraded to latest versions of diffusers, transformers, safetensors and accelerate libraries upstream. We hope that this will fix the assertion NDArray > 2**32 issue that MacOS users have had when generating images larger than 768x768 pixels. Please report back.
Unfortunately, it did not work. At least not for me. (Apple MacBook Pro 64gb running Ventura 13.2.1)
I saw a comment on the a1111 issues suggesting that (what appears to be the same issue) is a bug in how 13.2.x interacts with pytorch. Apparently both 13.1 and 13.3 are unaffected.
The latest release (2.3.2) does not solve this issue. @whosawhatsis, can you please give us a link to the comment you are talking about?
Hmm, looking closer at the thread in question (which is kinda all over the place), I'm less confident that it's the same issue. Here's the comment I was remembering: https://github.com/AUTOMATIC1111/stable-diffusion-webui/discussions/7453#discussioncomment-5285469
I have updated to the latest commit, 27a113d8, and updated my venv using the following command
pip install --upgrade --upgrade-strategy eager --use-pep517 -e .
in order to upgrade to the latest versions of the torch, diffusers, and transformers modules
pip list | grep -e diffuser -e transformer -e torch
clip-anytorch 2.5.2
diffusers 0.14.0
pytorch-lightning 1.7.7
torch 2.0.0
torchmetrics 0.11.4
torchvision 0.15.1
transformers 4.27.1
but the issue is not solved: generating a 704x768 image with the stock SD-1.5 model fails.
invokeai --web
* Initializing, be patient...
>> Initialization file /Users/ivano/Code/Ai/@Stuffs/invokeai.models/invokeai.init found. Loading...
>> Internet connectivity is True
>> InvokeAI, version 3.0.0+a0
>> InvokeAI runtime directory is "/Users/ivano/Code/Ai/@Stuffs/invokeai.models"
>> GFPGAN Initialized
>> CodeFormer Initialized
>> ESRGAN Initialized
>> Using device_type mps
>> xformers not installed
>> NSFW checker is disabled
>> Current VRAM usage: 0.00G
>> Loading diffusers model from runwayml/stable-diffusion-v1-5
| Using more accurate float32 precision
| Loading diffusers VAE from stabilityai/sd-vae-ft-mse
| Using more accurate float32 precision
Fetching 15 files: 100%| | 15/15 [00:00<00:00, 44119.61it/s]
| Default image dimensions = 512 x 512
>> Loading embeddings from /Users/ivano/Code/Ai/@Stuffs/invokeai.models/embeddings
>> Textual inversion triggers: bad_prompt
>> Model loaded in 4.52s
>> Setting Sampler to k_lms (LMSDiscreteScheduler)
* --web was specified, starting web server...
Loading Python libraries...
* Initializing, be patient...
>> Initialization file /Users/ivano/Code/Ai/@Stuffs/invokeai.models/invokeai.init found. Loading...
>> Started Invoke AI Web Server
>> Default host address now 127.0.0.1 (localhost). Use --host 0.0.0.0 to bind any address.
>> Point your browser at http://127.0.0.1:9090
>> System config requested
>> Patchmatch initialized
>> Image Generation Parameters:
{'prompt': 'am happy dog in a nice garden', 'iterations': 1, 'steps': 30, 'cfg_scale': 7.5, 'threshold': 0, 'perlin': 0, 'height': 768, 'width': 704, 'sampler_name': 'k_euler_a', 'seed': 226339246, 'progress_images': False, 'progress_latents': True, 'save_intermediates': 5, 'generation_mode': 'txt2img', 'init_mask': '...', 'hires_fix': False, 'seamless': False, 'variation_amount': 0}
>> ESRGAN Parameters: False
>> Facetool Parameters: False
>> Setting Sampler to k_euler_a (EulerAncestralDiscreteScheduler)
Generating: 0%|
/AppleInternal/Library/BuildRoots/c651a45f-806e-11ed-a221-7ef33c48bc85/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShaders/MPSCore/Types/MPSNDArray.mm:724: failed assertion `[MPSNDArray initWithDevice:descriptor:] Error: total bytes of NDArray > 2**32'
zsh: abort invokeai --web
/opt/homebrew/Cellar/python@3.10/3.10.10_1/Frameworks/Python.framework/Versions/3.10/lib/python3.10/multiprocessing/resource_tracker.py:224: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
warnings.warn('resource_tracker: There appear to be %d '
@keturn @lstein @pcuenca **Good news!!!** This issue is solved by upgrading to Ventura 13.3 and PyTorch 2.0.
I'm running InvokeAI 3.0.0+a0 on commit 09dfde0b
pip list | grep -e diffuser -e transformer -e torch
clip-anytorch 2.5.2
diffusers 0.14.0
pytorch-lightning 1.7.7
torch 2.0.0
torchmetrics 0.11.4
torchvision 0.15.1
transformers 4.27.3
I was able to generate images up to 1088x1088; beyond that I still get Error: total bytes of NDArray > 2**32.
I will propose closing this issue after other Mac users confirm.
Hoorah! Upgrade to Ventura 13.3 did the trick! Seems to be working with invokeai 2.3.2 also, even without newest pytorch.
pip list | grep -e diffuser -e transformer -e torch
clip-anytorch 2.5.2
diffusers 0.14.0
pytorch-lightning 1.7.7
taming-transformers-rom1504 0.0.6
torch 1.13.1
torch-fidelity 0.3.0
torchdiffeq 0.2.3
torchmetrics 0.11.4
torchsde 0.2.5
torchvision 0.14.1
transformers 4.26.1
It's chugging down memory though: at 1088x1088 it quickly consumed all 64 GB of RAM, slowing my Mac, and generation speed dropped to almost a standstill. It's generating at 248s per iteration ;-) (but holding!)
I had no issues generating images at 832x832 though, something that was not possible with the previous version of Ventura. I'll keep trying to find the maximum dimensions, but will wait until I'm not depending on my Mac for work.
UPDATE 1: 1088x1088 generated in 4317.85s
UPDATE 2: Everything above 1088x1088 fails, confirming what @i3oc9i experienced
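Since the reports above point to the macOS version (failures on Ventura 13.2.1, larger sizes working after upgrading to 13.3), a quick check of the running system may be useful before retesting. This is a minimal sketch using only the standard library; the helper names and the 13.3 default are illustrative and drawn solely from what users reported in this thread.

```python
import platform


def version_tuple(version):
    """Turn a dotted version string such as '13.2.1' into a tuple of ints."""
    return tuple(int(part) for part in version.split(".") if part.isdigit())


def macos_at_least(version, minimum=(13, 3)):
    """True if `version` is at or above `minimum`; an empty string counts as False."""
    return version_tuple(version) >= minimum


current = platform.mac_ver()[0]  # empty string on non-macOS systems
print(f"macOS {current or 'n/a'}: at least 13.3 -> {macos_at_least(current)}")
```

A True result only says the system is on the version where users here saw the limit move; sizes above 1088x1088 were still reported to fail.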