InvokeAI
[bug]: Regression on MPS - SDXL with only Prompt outputs junk
Is there an existing issue for this problem?
- [X] I have searched the existing issues
Operating system
macOS
GPU vendor
Apple Silicon (MPS)
GPU model
No response
GPU VRAM
No response
Version number
f06765dfba736c8b5ca13d319953cc1b8ba1b5f3
Browser
n/a
Python dependencies
No response
What happened
Generation with only a prompt and no other control layers outputs junk.
`git bisect` identifies f06765dfba736c8b5ca13d319953cc1b8ba1b5f3 as the first bad commit:
```
commit f06765dfba736c8b5ca13d319953cc1b8ba1b5f3 (HEAD)
Author: Ryan Dick <[email protected]>
Date:   Mon Sep 30 22:36:25 2024 +0000

    Get alternative GGUF implementation working... barely.

 invokeai/backend/model_manager/load/model_loaders/flux.py |   3 +--
 invokeai/backend/quantization/gguf/ggml_tensor.py | 78 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++---------------------
 invokeai/backend/quantization/gguf/loaders.py | 114 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++--------------------------------------------------
 pyproject.toml |   4 ++--
 4 files changed, 124 insertions(+), 75 deletions(-)
```
User reports downgrading torch and torchvision resolves the issue.
More details in discord thread: https://discord.com/channels/1020123559063990373/1292396911852126228
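A sketch of the downgrade users reported as a fix. The torchvision version here is an assumed pairing for torch 2.3.1; check the official torch/torchvision compatibility matrix before pinning.

```shell
# Pin torch to the last release reported as working in this thread.
# torchvision 0.18.1 is the assumed matching release for torch 2.3.1.
pip install "torch==2.3.1" "torchvision==0.18.1"
```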
What you expected to happen
Generation with only a prompt should produce a normal, coherent image.
How to reproduce the problem
No response
Additional context
No response
Discord username
i3oc9i
More reports from discord: https://discord.com/channels/1020123559063990373/1149510134058471514/1292877390640320523
Seems to be an issue with the default attention type and PyTorch 2.4.1. It can be worked around by setting the attention type to `torch-sdp` in `invokeai.yaml`:

```yaml
attention_type: torch-sdp
```

or by upgrading torch to a nightly or the 2.5.0 test build (or by downgrading to 2.3.1).
Test results on my Apple M3:
SD1.5, 1024x1024
- torch 2.2.2 sliced: works
- torch 2.4.1 sliced: produces noise
- torch 2.4.1 non-sliced: maxes out memory and is extremely slow - did not run to completion
- torch 2.6.0.dev20241008 (nightly) sliced: produces noise
- torch 2.6.0.dev20241008 (nightly) non-sliced:
RuntimeError: Invalid buffer size: 16.00 GB
SD1.5, 512x512
- torch 2.6.0.dev20241008 (nightly) sliced: produces noise
- torch 2.6.0.dev20241008 (nightly) non-sliced: works
- torch 2.4.1 sliced: produces noise
- torch 2.4.1 non-sliced: works
SDXL, 1024x1024
- torch 2.4.1 sliced: noise
- torch 2.4.1 non-sliced: works
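The pattern in these results (sliced attention works on torch 2.3.1 and earlier, produces noise on 2.4.1 and the 2.6 nightlies) can be captured in a small version gate. This is a hypothetical helper sketched from the reports above, not InvokeAI code; the name `safe_attention_type` and its return values are assumptions.

```python
def safe_attention_type(torch_version: str) -> str:
    """Hypothetical helper: pick an attention setting that avoids the
    sliced-attention noise observed on MPS with torch >= 2.4
    (per the test results in this thread)."""
    # "2.6.0.dev20241008" -> (2, 6); tolerant of nightly/dev suffixes.
    major, minor = (int(part) for part in torch_version.split(".")[:2])
    if (major, minor) >= (2, 4):
        return "torch-sdp"  # sliced attention reported broken here
    return "sliced"         # older torch: sliced attention works
```

Usage would be `safe_attention_type(torch.__version__)`.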
The torch nightly version did not solve the issue for me. I have put up a PR with a workaround: #7066. I see this as a temporary solution that we'll want to revisit at some point.
The best solution would be to simply get sliced attention working properly on MPS with the latest torch. Might make sense to do this when we push https://github.com/invoke-ai/InvokeAI/pull/6550 across the finish line.