[bug]: Regression in 2.3.0 on MPS -- Python crashing and new errors on smaller sizes
Is there an existing issue for this?
- [X] I have searched the existing issues
OS
macOS
GPU
mps
VRAM
64GB
What happened?
I have a new M2 Max (38 core) MacBook Pro running macOS 13.2. In Invoke 2.2.5, there is a known crashing issue for image generations at 1024x1024 on Apple Silicon systems ("total bytes of NDArray > 2**32"), but otherwise I was able to generate at up to 1536x1536 (I haven't tested larger than this). Invoke 2.3.0 has a significant regression, though, and will not generate anything larger than 704x704 with SD 1.5 or 768x768 with SD-2.1-768. I originally tested 2.3.0-rc7, and verified that the final release shows the same behavior.
Using the SD 1.5 model, generations from 512x512 to 704x704 finish correctly, perhaps even slightly faster than in Invoke 2.2.5. However, with 2.3.0, generations at 768x768 and 832x832 now cause Python to crash with the NDArray error written to the Terminal:
AppleInternal/Library/BuildRoots/c651a45f-806e-11ed-a221-7ef33c48bc85/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShaders/MPSCore/Types/MPSNDArray.mm:724: failed assertion `[MPSNDArray initWithDevice:descriptor:] Error: total bytes of NDArray > 2**32'
Between 832x832 and 1216x1216, Python instead crashes with a different error written to Terminal. I have not seen anyone report this error before (and note that it replaces the known error previously reported at 1024x1024):
AppleInternal/Library/BuildRoots/c651a45f-806e-11ed-a221-7ef33c48bc85/Library/Caches/com.apple.xbs/Sources/MetalPerformanceShaders/MPSCore/Types/MPSNDArray.mm:705: failed assertion `[MPSNDArray initWithDevice:descriptor:] Error: product of dimension sizes > 2**31'
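For context, the two assertions point at two different Metal Performance Shaders limits: a single NDArray may not exceed 2**32 bytes (4 GiB), and its element count may not exceed 2**31. The snippet below is a minimal, hypothetical probe of those limits; the shapes are my own choices, and whether it actually reproduces the assertions depends on the macOS and PyTorch versions in use.

```python
import torch

# Hypothetical probe (not from the original report): allocate MPS tensors just
# past the two limits named in the assertions. On affected macOS/PyTorch
# combinations this is expected to abort the process; on fixed systems it may
# only raise a Python-level error or simply succeed. Run each allocation
# separately, since a failed assertion kills the interpreter.
assert torch.backends.mps.is_available()
dev = torch.device("mps")

# ~4.36e9 bytes (> 2**32) but only ~1.09e9 elements (< 2**31):
# should trip "total bytes of NDArray > 2**32"
a = torch.empty((33000, 33000), dtype=torch.float32, device=dev)

# ~2.15e9 elements (> 2**31) but only ~2.15e9 bytes (< 2**32):
# should trip "product of dimension sizes > 2**31"
b = torch.empty((46400, 46400), dtype=torch.uint8, device=dev)
```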
Further, at 1280x1280 and above (all the way to 2048x2048), Python stops crashing and instead there is a nonfatal error printed to Terminal with no generation. It consists of a long traceback ending in:
File "/Applications/Stable Diffusion/invokeai/.venv/lib/python3.10/site-packages/diffusers/models/cross_attention.py", line 189, in get_attention_scores torch.empty(query.shape[0], query.shape[1], key.shape[1], dtype=query.dtype, device=query.device), RuntimeError: Invalid buffer size: 39.06 GB
The reported buffer size depends on the requested image resolution, ranging from the above 39.06 GB at 1280x1280 up to 256.00 GB at 2048x2048.
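The reported sizes are consistent with the attention score tensor that get_attention_scores allocates in the traceback above: shape (batch × heads, tokens, tokens), where tokens is the number of latent pixels, (height/8) × (width/8). The back-of-the-envelope check below is my own, assuming a batch of 2 (classifier-free guidance), 8 heads in SD 1.5's highest-resolution attention block, and float32 scores; under those assumptions it reproduces the reported figures.

```python
def attention_scores_bytes(px: int, batch: int = 2, heads: int = 8, bytes_per_elem: int = 4) -> int:
    """Rough size of the (batch*heads, tokens, tokens) self-attention score tensor
    for a square px-by-px generation; tokens = (px/8)**2 latent pixels.
    Assumptions: batch=2 for classifier-free guidance, 8 heads, float32 scores."""
    tokens = (px // 8) ** 2
    return batch * heads * tokens * tokens * bytes_per_elem

for px in (1280, 2048):
    print(f"{px}x{px}: {attention_scores_bytes(px) / 2**30:.2f} GB")
    # 1280x1280 -> 39.06 GB, 2048x2048 -> 256.00 GB, matching the errors above
```

Because tokens grows with the square of the image side, the score tensor grows with its fourth power, so each doubling of resolution multiplies the requested buffer by 16.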
Using the SD 2.1-768 model, the behavior is a little different. This model outputs correctly up to 768x768, but produces the NDArray error from 832x832 to 960x960 and the dimension sizes error from 1024x1024 to 1408x1408. At 1472x1472 and above I get the nonfatal buffer size error, with listed buffer sizes ranging from 42.70 GB to 160.00 GB.
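(For what it's worth, the same back-of-the-envelope arithmetic reproduces these figures too if the head count is changed from 8 to 5, an assumption for SD 2.1's highest-resolution attention block: 2 × 5 × (1472/8)**4 × 4 bytes ≈ 42.70 GB and 2 × 5 × (2048/8)**4 × 4 bytes = 160.00 GB.)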
Please help to restore the generation functionality on MPS. Thanks!
Screenshots
No response
Additional context
No response
Contact Details
No response
I also reported this as #2444
As a workaround, don't use diffusers models for now; use a .ckpt or .safetensors model instead.
Thanks. I saw that bug report, but you didn't report the "product of dimension sizes" crash or the buffer size error.
@Adreitz can you also confirm this issue? #2603
Thanks, I'll check later today when I'm at home.
There has been no activity in this issue for 14 days. If this issue is still being experienced, please reply with an updated confirmation that the issue is still being experienced with the latest release.
The issue is still being experienced with the latest release; see #2444.
Yes, I'm still experiencing this issue.
Update: v. 2.3.2-post1 exhibits exactly the same behavior with the SD1.5 model as I described above. I haven't checked SD2.1, but I expect it to also be the same as with 2.3.0.
@i3oc9i I just updated to the macOS 13.3 public beta (v.4). I am seeing NO MORE CRASHING, including at 1024x1024. However, I am seeing greatly increased memory use. It got a bit better once I updated to the new torch/torchaudio/torchvision releases, but I'm still getting 10.7 GB used during a 512x512 generation and over 55 GB used at 1024x1024. Can you confirm?
@Adreitz I would prefer not to upgrade to the macOS 13.3 public beta, because I use my Mac for work too and have other software that may break. Anyway, this is quite good news; I will confirm as soon as the official 13.3 is available, probably next month.
In the meantime, I updated to the new torch/torchaudio/torchvision releases, and that alone does not solve the issue.
https://github.com/invoke-ai/InvokeAI/issues/2444#issuecomment-1474778828
@Adreitz I confirm the issue is solved with the official Ventura 13.3; see this comment:
https://github.com/invoke-ai/InvokeAI/issues/2444#issuecomment-1485891105
@i3oc9i Thanks for the response. Can you confirm the increased memory use by Python, or is it just my computer? That has now replaced the crashes as the reason I cannot generate at larger image sizes.
@Adreitz I'm not able to be precise about memory usage, because I did not monitor it exactly before the update. My general feeling is no; maybe there is some increase, but I have 128 GB, so I don't get any memory pressure.
Closing as resolved with the macOS 13.3 update. A new issue may be opened regarding high memory use.