[Issue]: FLUX qint load failed
Issue Description
Run SD.Next with --use-cuda --use-xformers --models-dir e:\Models
I set up FLUX.1-dev-qint8 [fd65655d4d] using the model selection dialog.
Set "Model Offloading" to "model"; other settings unchanged.
If I try to run the model with any prompt, I get the following error:
20:58:44-116531 ERROR Exception: local variable 'attn_output' referenced before assignment
20:58:44-118531 ERROR Arguments: args=('task(385hld2cc59l159)', '', 'mario brothers', '', [], 20, 0, 40, True,
False, False, False, 1, 1, 6, 6, 0.7, 0, 0.5, 1, 1, -1.0, -1.0, 0, 0, 0, 1024, 1024, False,
0.3, 1, 1, 'Add with forward', 'None', False, 20, 0, 0, 20, 0, '', '', 0, 0, 0, 0, False, 4,
0.95, False, 0.6, 1, '#000000', 0, [], 0, 1, False, 'None', 'None', 'None', 'None', 0.5, 0.5,
0.5, 0.5, None, None, None, None, False, False, False, False, 0, 0, 0, 0, 1, 1, 1, 1, None,
None, None, None, False, '', False, 0, '', [], 0, '', [], 0, '', [], False, True, False, True,
False, False, False, False, 0, False, 'None', 2, True, 1, 0, 1, -0.5, 0, '', 0.5, 5, None, '',
0.5, 5, None, 3, 1, 1, 0.8, 8, 64, True, 'None', [], 'FaceID Base', True, True, 1, 1, 1, 0.5,
True, 'person', 1, 0.5, True, 2, True, 1, 35, True, 1, 0.75, True, 2, 0.75, False, 3, 0.75,
False, 4, 0.75, 0.65, True, False, 1, 1, 1, '', True, 0.5, 600.0, 1.0, True, None, 1, 0, 0, 0,
0, 0, 0, 0, 1, 1, 1, 0.5, 0.5, 'OpenGVLab/InternVL-14B-224px', False, 0.7, 1.2, 128, False,
False, 'positive', 'comma', 0, False, False, '', [], 0.8, 20, 'dpmpp_sde', 'v2', False, True,
'v1.1', 'None', '', 1, '', 'None', 1, '7,8,9', 1, 0.01, 0.2, None, '', False, ['attention',
'adain_queries', 'adain_keys'], 1, 0, 0, True, 10, 'None', 16, 'None', 1, True, 'None', 2,
True, 1, 0, True, 'none', 3, 4, 0.25, 0.25, 'THUDM/CogVideoX-2b', 'DDIM', 49, 6, 'balanced',
True, 'None', 8, True, 1, 0, None, None, 45, 'None', 2, True, 1, 0, '0.9.1', '', 'diffusers',
True, 41, 'None', 2, True, 1, 0, 45, 'None', 2, True, 1, 0, 'None', True, 0, 'None', 2, True,
1, 0, 0, '', [], 0, '', [], 0, '', [], False, True, False, True, False, False, False, False,
0, False, 'None', 2, True, 1, 0) kwargs={}
20:58:44-160386 ERROR gradio call: UnboundLocalError
┌───────────────────────────────────────── Traceback (most recent call last) ─────────────────────────────────────────┐
│ E:\SD.Next\modules\call_queue.py:31 in f │
│ │
│ 30 │ │ │ try: │
│ > 31 │ │ │ │ res = func(*args, **kwargs) │
│ 32 │ │ │ │ progress.record_results(id_task, res) │
│ │
│ E:\SD.Next\modules\txt2img.py:93 in txt2img │
│ │
│ 92 │ if processed is None: │
│ > 93 │ │ processed = processing.process_images(p) │
│ 94 │ processed = scripts.scripts_txt2img.after(p, processed, *args) │
│ │
│ E:\SD.Next\modules\processing.py:210 in process_images │
│ │
│ 209 │ │ │ with context_hypertile_vae(p), context_hypertile_unet(p): │
│ > 210 │ │ │ │ processed = process_images_inner(p) │
│ 211 │
│ │
│ E:\SD.Next\modules\processing.py:337 in process_images_inner │
│ │
│ 336 │ │ │ │ │ from modules.processing_diffusers import process_diffusers │
│ > 337 │ │ │ │ │ samples = process_diffusers(p) │
│ 338 │ │ │ │ else: │
│ │
│ E:\SD.Next\modules\processing_diffusers.py:453 in process_diffusers │
│ │
│ 452 │ if 'base' not in p.skip: │
│ > 453 │ │ output = process_base(p) │
│ 454 │ else: │
│ │
│ E:\SD.Next\modules\processing_diffusers.py:102 in process_base │
│ │
│ 101 │ │ else: │
│ > 102 │ │ │ output = shared.sd_model(**base_args) │
│ 103 │ │ if isinstance(output, dict): │
│ │
│ e:\SD.Next\venv\lib\site-packages\torch\utils\_contextlib.py:116 in decorate_context │
│ │
│ 115 │ │ with ctx_factory(): │
│ > 116 │ │ │ return func(*args, **kwargs) │
│ 117 │
│ │
│ e:\SD.Next\venv\lib\site-packages\diffusers\pipelines\flux\pipeline_flux.py:889 in __call__ │
│ │
│ 888 │ │ │ │ │
│ > 889 │ │ │ │ noise_pred = self.transformer( │
│ 890 │ │ │ │ │ hidden_states=latents, │
│ │
│ e:\SD.Next\venv\lib\site-packages\torch\nn\modules\module.py:1736 in _wrapped_call_impl │
│ │
│ 1735 │ │ else: │
│ > 1736 │ │ │ return self._call_impl(*args, **kwargs) │
│ 1737 │
│ │
│ e:\SD.Next\venv\lib\site-packages\torch\nn\modules\module.py:1747 in _call_impl │
│ │
│ 1746 │ │ │ │ or _global_forward_hooks or _global_forward_pre_hooks): │
│ > 1747 │ │ │ return forward_call(*args, **kwargs) │
│ 1748 │
│ │
│ e:\SD.Next\venv\lib\site-packages\diffusers\models\transformers\transformer_flux.py:522 in forward │
│ │
│ 521 │ │ │ else: │
│ > 522 │ │ │ │ encoder_hidden_states, hidden_states = block( │
│ 523 │ │ │ │ │ hidden_states=hidden_states, │
│ │
│ e:\SD.Next\venv\lib\site-packages\torch\nn\modules\module.py:1736 in _wrapped_call_impl │
│ │
│ 1735 │ │ else: │
│ > 1736 │ │ │ return self._call_impl(*args, **kwargs) │
│ 1737 │
│ │
│ e:\SD.Next\venv\lib\site-packages\torch\nn\modules\module.py:1747 in _call_impl │
│ │
│ 1746 │ │ │ │ or _global_forward_hooks or _global_forward_pre_hooks): │
│ > 1747 │ │ │ return forward_call(*args, **kwargs) │
│ 1748 │
│ │
│ e:\SD.Next\venv\lib\site-packages\diffusers\models\transformers\transformer_flux.py:193 in forward │
│ │
│ 192 │ │ # Process attention outputs for the `hidden_states`. │
│ > 193 │ │ attn_output = gate_msa.unsqueeze(1) * attn_output │
│ 194 │ │ hidden_states = hidden_states + attn_output │
└─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
UnboundLocalError: local variable 'attn_output' referenced before assignment
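For context, this `UnboundLocalError` is standard Python behavior: if a name is assigned anywhere in a function body, Python treats it as local for the whole function, so reaching a use of it before the assigning branch ran raises this error. A minimal illustration (the `forward`/`use_attention` names here are hypothetical, not from the diffusers source):

```python
def forward(use_attention: bool) -> float:
    if use_attention:
        attn_output = 1.0  # assigned only on this branch
    # if the branch above was skipped, this line raises UnboundLocalError,
    # exactly the failure mode shown in the traceback
    return attn_output * 2.0

try:
    forward(False)
except UnboundLocalError as e:
    print(e)  # local variable 'attn_output' referenced before assignment
```

In the traceback above, the quantized attention path presumably never assigned `attn_output` before line 193 used it.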
Version Platform Description
Python: version=3.10.6 platform=Windows Version: app=sd.next updated=2024-12-24 hash=451eeab1 branch=master url=https://github.com/vladmandic/automatic/tree/master ui=main Platform: arch=AMD64 cpu=Intel64 Family 6 Model 165 Stepping 3, GenuineIntel system=Windows release=Windows-10-10.0.19045-SP0 python=3.10.6 docker=False Extensions: enabled=['Lora', 'sd-extension-chainner', 'sd-extension-system-info', 'sd-webui-agent-scheduler', 'sdnext-modernui', 'stable-diffusion-webui-rembg'] Device detect: memory=12.0 optimization=balanced Engine: backend=Backend.DIFFUSERS compute=cuda device=cuda attention="xFormers" mode=no_grad Torch parameters: backend=cuda device=cuda config=Auto dtype=torch.bfloat16 vae=torch.bfloat16 unet=torch.bfloat16 context=no_grad nohalf=False nohalfvae=False upcast=False deterministic=False test-fp16=True test-bf16=True optimization="xFormers" Device: device=NVIDIA GeForce RTX 3060 n=1 arch=sm_90 capability=(8, 6) cuda=12.4 cudnn=90100 driver=560.94 Torch: torch==2.5.1+cu124 torchvision==0.20.1+cu124 Packages: diffusers==0.33.0.dev0 transformers==4.47.1 accelerate==1.2.1 gradio==3.43.2
Relevant log output
No response
Backend
Diffusers
UI
Standard
Branch
Master
Model
FLUX.1
Acknowledgements
- [X] I have read the above and searched for existing issues
- [X] I confirm that this is classified correctly and its not an extension issue
Please try to reproduce using the latest version; it was just released and has some relevant fixes.
Now it stops working on FLUX load without any error. (Previously, debug messages were disabled by default.) Happy New Year!
23:57:56-105736 INFO Starting SD.Next
23:57:56-110872 INFO Logger: file="e:\SD.Next\sdnext.log" level=DEBUG size=65 mode=create
23:57:56-113872 INFO Python: version=3.10.6 platform=Windows bin="e:\SD.Next\venv\Scripts\python.exe"
venv="e:\SD.Next\venv"
23:57:56-607782 INFO Version: app=sd.next updated=2024-12-31 hash=dcfc9f3f branch=master
url=https://github.com/vladmandic/automatic/tree/master ui=main
23:57:57-334644 INFO Platform: arch=AMD64 cpu=Intel64 Family 6 Model 165 Stepping 3, GenuineIntel system=Windows
release=Windows-10-10.0.19045-SP0 python=3.10.6 docker=False
23:57:57-339611 DEBUG Packages: venv=venv site=['venv', 'venv\\lib\\site-packages']
23:57:57-344275 INFO Args: ['--use-cuda', '--use-xformers', '--models-dir', 'e:\\Models']
23:57:57-346267 DEBUG Setting environment tuning
23:57:57-348267 DEBUG Torch allocator: "garbage_collection_threshold:0.80,max_split_size_mb:512"
23:57:57-367541 DEBUG Torch overrides: cuda=True rocm=False ipex=False directml=False openvino=False zluda=False
23:57:57-374335 INFO CUDA: nVidia toolkit detected
23:58:01-353036 INFO Install: verifying requirements
23:58:01-361004 INFO Verifying packages
23:58:01-451357 DEBUG Timestamp repository update time: Tue Dec 31 20:29:18 2024
23:58:01-453356 INFO Startup: standard
23:58:01-455355 INFO Verifying submodules
23:58:04-841406 DEBUG Git submodule: extensions-builtin/sd-extension-chainner / main
23:58:05-030299 DEBUG Git submodule: extensions-builtin/sd-extension-system-info / main
23:58:05-213811 DEBUG Git submodule: extensions-builtin/sd-webui-agent-scheduler / main
23:58:05-494765 DEBUG Git detached head detected: folder="extensions-builtin/sdnext-modernui" reattach=main
23:58:05-496762 DEBUG Git submodule: extensions-builtin/sdnext-modernui / main
23:58:05-685070 DEBUG Git submodule: extensions-builtin/stable-diffusion-webui-rembg / master
23:58:05-873716 DEBUG Git submodule: modules/k-diffusion / master
23:58:06-060330 DEBUG Git submodule: wiki / master
23:58:06-159209 DEBUG Register paths
23:58:06-301363 DEBUG Installed packages: 183
23:58:06-304363 DEBUG Extensions all: ['Lora', 'sd-extension-chainner', 'sd-extension-system-info',
'sd-webui-agent-scheduler', 'sdnext-modernui', 'stable-diffusion-webui-rembg']
23:58:06-734188 DEBUG Extension installer: E:\SD.Next\extensions-builtin\sd-webui-agent-scheduler\install.py
23:58:09-810270 DEBUG Extension installer: E:\SD.Next\extensions-builtin\stable-diffusion-webui-rembg\install.py
23:58:19-121179 DEBUG Extensions all: []
23:58:19-124181 INFO Extensions enabled: ['Lora', 'sd-extension-chainner', 'sd-extension-system-info',
'sd-webui-agent-scheduler', 'sdnext-modernui', 'stable-diffusion-webui-rembg']
23:58:19-126180 INFO Install: verifying requirements
23:58:19-128179 DEBUG Setup complete without errors: 1735678699
23:58:19-135263 DEBUG Extension preload: {'extensions-builtin': 0.0, 'extensions': 0.0}
23:58:19-138263 INFO Command line args: ['--use-cuda', '--use-xformers', '--models-dir', 'e:\\Models']
models_dir=e:\Models use_cuda=True use_xformers=True
23:58:19-142377 DEBUG Env flags: []
23:58:19-144377 DEBUG Starting module: <module 'webui' from 'e:\\SD.Next\\webui.py'>
23:58:23-612038 INFO Device detect: memory=12.0 optimization=balanced
23:58:23-618039 DEBUG Read: file="config.json" json=32 bytes=1395 time=0.000 fn=<module>:load
23:58:23-966160 INFO Engine: backend=Backend.DIFFUSERS compute=cuda device=cuda attention="xFormers" mode=no_grad
23:58:23-970248 DEBUG Read: file="html\reference.json" json=62 bytes=32964 time=0.001
fn=_call_with_frames_removed:<module>
23:58:24-016308 INFO Torch parameters: backend=cuda device=cuda config=Auto dtype=torch.bfloat16 context=no_grad
nohalf=False nohalfvae=False upcast=False deterministic=False fp16=pass bf16=pass
optimization="xFormers"
23:58:24-259845 DEBUG ONNX: version=1.20.1 provider=CUDAExecutionProvider, available=['AzureExecutionProvider',
'CPUExecutionProvider']
23:58:24-435537 INFO Device: device=NVIDIA GeForce RTX 3060 n=1 arch=sm_90 capability=(8, 6) cuda=12.4 cudnn=90100
driver=560.94
23:58:24-967886 INFO Torch: torch==2.5.1+cu124 torchvision==0.20.1+cu124
23:58:24-971983 INFO Packages: diffusers==0.33.0.dev0 transformers==4.47.1 accelerate==1.2.1 gradio==3.43.2
23:58:25-100333 DEBUG Entering start sequence
23:58:25-103020 INFO Models path: e:\Models
23:58:25-107020 DEBUG Initializing
23:58:25-110020 DEBUG Read: file="metadata.json" json=6 bytes=2374 time=0.000 fn=initialize:init_metadata
23:58:25-113134 DEBUG Huggingface cache: path="C:\Users\Banderlog\.cache\huggingface\hub"
23:58:25-157255 INFO Available VAEs: path="e:\Models\VAE" items=1
23:58:25-160374 INFO Available UNets: path="e:\Models\UNET" items=0
23:58:25-162374 INFO Available TEs: path="e:\Models\Text-encoder" items=0
23:58:25-168375 INFO Available Models: items=8 safetensors="e:\Models\Stable-diffusion":6
diffusers="e:\Models\Diffusers":2 time=0.00
23:58:25-183587 INFO Available LoRAs: path="e:\Models\Lora" items=0 folders=2 time=0.00
23:58:25-220425 INFO Available Styles: folder="e:\Models\styles" items=288 time=0.03
23:58:25-316183 INFO Available Yolo: path="e:\Models\yolo" items=6 downloaded=0
23:58:25-321335 DEBUG Extensions: disabled=['Lora', 'sdnext-modernui']
23:58:25-325337 INFO Load extensions
23:58:26-312962 INFO Extension: script='extensions-builtin\sd-webui-agent-scheduler\scripts\task_scheduler.py'
Using sqlite file: extensions-builtin\sd-webui-agent-scheduler\task_scheduler.sqlite3
23:58:26-317962 DEBUG Extensions init time: 0.99 pulid_ext.py=0.44 sd-webui-agent-scheduler=0.39
23:58:26-330147 DEBUG Read: file="html/upscalers.json" json=4 bytes=2672 time=0.000 fn=__init__:__init__
23:58:26-334150 DEBUG Read: file="extensions-builtin\sd-extension-chainner\models.json" json=24 bytes=2719
time=0.000 fn=__init__:find_scalers
23:58:26-340146 DEBUG chaiNNer models: path="e:\Models\chaiNNer" defined=24 discovered=0 downloaded=0
23:58:26-349272 INFO Available Upscalers: items=52 downloaded=0 user=0 time=0.03 types=['None', 'Lanczos',
'Nearest', 'ChaiNNer', 'AuraSR', 'ESRGAN', 'RealESRGAN', 'SCUNet', 'SD', 'SwinIR']
23:58:26-357357 INFO UI start
23:58:26-359356 INFO UI theme: type=Standard name="black-teal" available=13
23:58:26-378521 DEBUG UI theme: css="E:\SD.Next\javascript\black-teal.css" base="sdnext.css" user="None"
23:58:26-383627 DEBUG UI initialize: txt2img
23:58:26-449283 DEBUG Networks: page='model' items=69 subfolders=2 tab=txt2img
folders=['e:\\Models\\Stable-diffusion', 'e:\\Models\\Diffusers', 'models\\Reference']
list=0.03 thumb=0.01 desc=0.00 info=0.00 workers=8
23:58:26-457523 DEBUG Networks: page='lora' items=0 subfolders=0 tab=txt2img folders=['e:\\Models\\Lora'] list=0.00
thumb=0.00 desc=0.00 info=0.00 workers=8
23:58:26-475745 DEBUG Networks: page='style' items=288 subfolders=1 tab=txt2img folders=['e:\\Models\\styles',
'html'] list=0.03 thumb=0.00 desc=0.00 info=0.00 workers=8
23:58:26-483863 DEBUG Networks: page='embedding' items=0 subfolders=0 tab=txt2img folders=['e:\\Models\\embeddings']
list=0.00 thumb=0.00 desc=0.00 info=0.00 workers=8
23:58:26-490990 DEBUG Networks: page='vae' items=1 subfolders=0 tab=txt2img folders=['e:\\Models\\VAE'] list=0.00
thumb=0.00 desc=0.00 info=0.00 workers=8
23:58:26-496992 DEBUG Networks: page='history' items=0 subfolders=0 tab=txt2img folders=[] list=0.00 thumb=0.00
desc=0.00 info=0.00 workers=8
23:58:26-843953 DEBUG UI initialize: img2img
23:58:27-072022 DEBUG UI initialize: control models=e:\Models\control
23:58:27-906800 DEBUG Read: file="ui-config.json" json=0 bytes=2 time=0.000 fn=__init__:read_from_file
23:58:28-513121 DEBUG Reading failed: E:\SD.Next\html\extensions.json [Errno 2] No such file or directory:
'E:\\SD.Next\\html\\extensions.json'
23:58:28-516123 INFO Extension list is empty: refresh required
23:58:29-261573 DEBUG Extension list: processed=6 installed=6 enabled=4 disabled=2 visible=6 hidden=0
23:58:29-445202 DEBUG Root paths: ['e:\\SD.Next']
23:58:29-551752 INFO Local URL: http://127.0.0.1:7860/
23:58:29-555723 DEBUG API middleware: [<class 'starlette.middleware.base.BaseHTTPMiddleware'>, <class
'starlette.middleware.gzip.GZipMiddleware'>]
23:58:29-561865 DEBUG API initialize
23:58:29-801423 INFO [AgentScheduler] Task queue is empty
23:58:29-804427 INFO [AgentScheduler] Registering APIs
23:58:30-142261 DEBUG Scripts setup: time=0.533 ['K-Diffusion Samplers:0.128', 'IP Adapters:0.062', 'XYZ
Grid:0.061', 'Face: Multiple ID Transfers:0.025', 'Video: VGen Image-to-Video:0.016',
'FreeScale: Tuning-Free Scale Fusion:0.016', 'Video: LTX Video:0.015', 'Video:
AnimateDiff:0.013', 'Video: CogVideoX:0.013', 'ConsiStory: Consistent Image Generation:0.012',
'PuLID: ID Customization:0.012', 'LUT Color grading:0.011', 'Ctrl-X: Controlling Structure and
Appearance:0.01', 'Video: Stable Video Diffusion:0.01']
23:58:30-150347 DEBUG Model metadata: file="metadata.json" no changes
23:58:30-153351 DEBUG Model requested: fn=run:<lambda>
23:58:30-156349 INFO Selecting first available checkpoint
23:58:30-158349 DEBUG Script callback init time: system-info.py:app_started=0.08 task_scheduler.py:app_started=0.36
23:58:30-160348 DEBUG Save: file="config.json" json=32 bytes=1346 time=0.001
23:58:30-162441 INFO Startup time: 7.15 torch=1.96 libraries=0.13 samplers=0.05 detailer=0.10 extensions=0.99
ui-networks=0.32 ui-txt2img=0.32 ui-img2img=0.18 ui-control=0.32 ui-extras=0.07 ui-models=0.29
ui-gallery=0.05 ui-settings=0.57 ui-extensions=0.83 ui-defaults=0.09 launch=0.18 api=0.14
app-started=0.45
23:58:38-793537 INFO Settings: changed=1 ['huggingface_token']
23:58:38-795538 DEBUG Save: file="config.json" json=33 bytes=1410 time=0.002
23:58:40-084538 INFO API None 200 http/1.1 GET /sdapi/v1/progress 127.0.0.1 0.002
23:58:40-323968 WARNING Server shutdown requested
23:58:41-240654 INFO Server restarting...
23:58:41-518502 INFO Server will restart
23:58:44-442734 DEBUG Memory: 0.95/31.89 collected=6056
23:58:44-448402 DEBUG Starting module: <module 'webui' from 'e:\\SD.Next\\webui.py'>
23:58:44-450402 DEBUG Entering start sequence
23:58:44-452535 INFO Models path: e:\Models
23:58:44-457535 DEBUG Initializing
23:58:44-459536 DEBUG Huggingface cache: path="C:\Users\Banderlog\.cache\huggingface\hub"
23:58:44-462649 INFO Available VAEs: path="e:\Models\VAE" items=1
23:58:44-465650 INFO Available UNets: path="e:\Models\UNET" items=0
23:58:44-467650 INFO Available TEs: path="e:\Models\Text-encoder" items=0
23:58:44-475036 INFO Available Models: items=8 safetensors="e:\Models\Stable-diffusion":6
diffusers="e:\Models\Diffusers":2 time=0.00
23:58:44-479668 INFO Available LoRAs: path="e:\Models\Lora" items=0 folders=2 time=0.00
23:58:44-514401 INFO Available Styles: folder="e:\Models\styles" items=288 time=0.03
23:58:44-517989 INFO Available Yolo: path="e:\Models\yolo" items=6 downloaded=0
23:58:44-522151 DEBUG Extensions: disabled=['Lora', 'sdnext-modernui']
23:58:44-524149 INFO Load extensions
23:58:44-630688 DEBUG Extensions init time: 0.10
23:58:44-638478 DEBUG Read: file="extensions-builtin\sd-extension-chainner\models.json" json=24 bytes=2719
time=0.000 fn=__init__:find_scalers
23:58:44-644653 DEBUG chaiNNer models: path="e:\Models\chaiNNer" defined=24 discovered=0 downloaded=0
23:58:44-649653 INFO Available Upscalers: items=52 downloaded=0 user=0 time=0.01 types=['None', 'Lanczos',
'Nearest', 'AuraSR', 'ESRGAN', 'RealESRGAN', 'SCUNet', 'SD', 'SwinIR', 'ChaiNNer']
23:58:44-655396 INFO UI start
23:58:44-657395 INFO UI theme: type=Standard name="black-teal" available=13
23:58:44-675039 DEBUG UI theme: css="E:\SD.Next\javascript\black-teal.css" base="sdnext.css" user="None"
23:58:44-679011 DEBUG UI initialize: txt2img
23:58:44-746062 DEBUG Networks: page='model' items=69 subfolders=2 tab=txt2img
folders=['e:\\Models\\Stable-diffusion', 'e:\\Models\\Diffusers', 'models\\Reference']
list=0.03 thumb=0.01 desc=0.00 info=0.00 workers=8
23:58:44-754863 DEBUG Networks: page='lora' items=0 subfolders=0 tab=txt2img folders=['e:\\Models\\Lora'] list=0.00
thumb=0.00 desc=0.00 info=0.00 workers=8
23:58:44-768952 DEBUG Networks: page='style' items=288 subfolders=1 tab=txt2img folders=['e:\\Models\\styles',
'html'] list=0.04 thumb=0.00 desc=0.00 info=0.00 workers=8
23:58:44-777053 DEBUG Networks: page='embedding' items=0 subfolders=0 tab=txt2img folders=['e:\\Models\\embeddings']
list=0.00 thumb=0.00 desc=0.00 info=0.00 workers=8
23:58:44-784993 DEBUG Networks: page='vae' items=1 subfolders=0 tab=txt2img folders=['e:\\Models\\VAE'] list=0.01
thumb=0.00 desc=0.00 info=0.00 workers=8
23:58:44-789980 DEBUG Networks: page='history' items=0 subfolders=0 tab=txt2img folders=[] list=0.00 thumb=0.00
desc=0.00 info=0.00 workers=8
23:58:44-985301 DEBUG UI initialize: img2img
23:58:45-200663 DEBUG UI initialize: control models=e:\Models\control
23:58:45-774479 DEBUG Read: file="ui-config.json" json=0 bytes=2 time=0.001 fn=__init__:read_from_file
23:58:46-205014 DEBUG Reading failed: E:\SD.Next\html\extensions.json [Errno 2] No such file or directory:
'E:\\SD.Next\\html\\extensions.json'
23:58:46-209019 INFO Extension list is empty: refresh required
23:58:47-002795 DEBUG Extension list: processed=6 installed=6 enabled=4 disabled=2 visible=6 hidden=0
23:58:47-182628 DEBUG Root paths: ['e:\\SD.Next']
23:58:47-488002 INFO Local URL: http://127.0.0.1:7860/
23:58:47-491253 DEBUG API middleware: [<class 'starlette.middleware.base.BaseHTTPMiddleware'>, <class
'starlette.middleware.gzip.GZipMiddleware'>]
23:58:47-494254 DEBUG API initialize
23:58:47-608586 INFO [AgentScheduler] Task queue is empty
23:58:47-611734 INFO [AgentScheduler] Registering APIs
23:58:47-691473 INFO API None 200 http/1.1 GET /sdapi/v1/progress 127.0.0.1 0.0267
23:58:47-754620 DEBUG Scripts setup: time=0.909 ['K-Diffusion Samplers:0.129', 'XYZ Grid:0.12', 'IP Adapters:0.11',
'Face: Multiple ID Transfers:0.049', 'FreeScale: Tuning-Free Scale Fusion:0.031', 'Video:
CogVideoX:0.029', 'Video: VGen Image-to-Video:0.028', 'Video: LTX Video:0.027', 'Video:
AnimateDiff:0.025', 'ConsiStory: Consistent Image Generation:0.024', 'LUT Color
grading:0.023', 'PuLID: ID Customization:0.022', 'Ctrl-X: Controlling Structure and
Appearance:0.02', 'Video: Stable Video Diffusion:0.019', 'Style Aligned Image
Generation:0.019', 'Video: Hunyuan Video:0.016', 'HDR: High Dynamic Range:0.015', 'Prompt
matrix:0.015', 'Video: ModelScope:0.015', 'Prompt enhance:0.013', 'Prompts from file:0.012',
'LEdits: Limitless Image Editing:0.011', 'InstantIR: Image Restoration:0.01', 'Video: Mochi.1
Video:0.01', 'DemoFusion: High-Resolution Image Generation:0.01']
23:58:47-766856 DEBUG Model metadata: file="metadata.json" no changes
23:58:47-769194 DEBUG Model requested: fn=run:<lambda>
23:58:47-772399 INFO Selecting first available checkpoint
23:58:47-775510 DEBUG Script callback init time: system-info.py:app_started=0.17 task_scheduler.py:app_started=0.52
23:58:47-777530 DEBUG Save: file="config.json" json=33 bytes=1410 time=0.002
23:58:47-778515 INFO Startup time: 17.60 ldm=14.28 extensions=0.10 ui-networks=0.30 ui-txt2img=0.17 ui-img2img=0.17
ui-control=0.30 ui-extras=0.06 ui-models=0.07 ui-settings=0.40 ui-extensions=0.87
ui-defaults=0.09 launch=0.38 app-started=0.25
23:58:50-902061 INFO API None 200 http/1.1 GET /sdapi/v1/motd 127.0.0.1 0.7481
23:58:55-641958 INFO Browser session: user=None client=127.0.0.1 agent=Mozilla/5.0 (Windows NT 10.0; Win64; x64;
rv:128.0) Gecko/20100101 Firefox/128.0
23:58:55-643959 INFO API None 200 http/1.1 GET /sdapi/v1/sd-models 127.0.0.1 0.0051
23:58:55-648958 INFO API None 200 http/1.1 GET /sdapi/v1/start 127.0.0.1 0.008
23:58:57-359124 INFO UI: ready time=8.854
23:59:11-983718 INFO Load model: select="Diffusers\Disty0/FLUX.1-dev-qint8 [fd65655d4d]"
23:59:11-989718 DEBUG Load model: type=FLUX model="Diffusers\Disty0/FLUX.1-dev-qint8" repo="Disty0/FLUX.1-dev-qint8"
unet="None" te="None" vae="Automatic" quant=qint8 offload=model dtype=torch.bfloat16
23:59:12-917441 INFO HF login: token="C:\Users\Banderlog\.cache\huggingface\token"
23:59:13-143711 DEBUG GC: current={'gpu': 1.03, 'ram': 1.01, 'oom': 0} prev={'gpu': 1.03, 'ram': 1.01} load={'gpu':
9, 'ram': 3} gc={'gpu': 0.0, 'py': 9201} fn=load_diffuser_force:load_flux why=force time=0.22
23:59:13-442680 DEBUG Quantization: type=quanto version=0.2.6 fn=load_flux:load_flux_quanto
After installing SD.Next from scratch, removing the previously downloaded model, and downloading it again, I get the same error. Model: "Disty0/FLUX.1-dev-qint8"
00:51:24-856388 ERROR Exception: local variable 'attn_output' referenced before assignment
00:51:24-856388 ERROR Arguments: args=('task(w8l549ppizl9jth)', '', 'mario brothers Mario and Luigi', '', [], 20, 0,
40, True, False, False, False, 1, 1, 6, 6, 0.7, 0, 0.5, 1, 1, -1.0, -1.0, 0, 0, 0, 1024, 1024,
False, 0.3, 1, 1, 'Add with forward', 'None', False, 20, 0, 0, 20, 0, '', '', 0, 0, 0, 0,
False, 4, 0.95, False, 0.6, 1, '#000000', 0, [], 0, 1, False, 'None', 'None', 'None', 'None',
0.5, 0.5, 0.5, 0.5, None, None, None, None, False, False, False, False, 0, 0, 0, 0, 1, 1, 1,
1, None, None, None, None, False, '', False, 0, '', [], 0, '', [], 0, '', [], False, True,
False, True, False, False, False, False, 0, False, 'None', 2, True, 1, 0, 1, -0.5, 0, '', '',
'', 0.5, True, True, False, True, True, False, '0.6, 0.4, 1.1, 1.2', '10, 20, 0.8', False, '',
0.5, 5, None, '', 0.5, 5, None, 3, 1, 1, 0.8, 8, 64, True, 'None', [], 'FaceID Base', True,
True, 1, 1, 1, 0.5, True, 'person', 1, 0.5, True, 2, True, 1, 35, True, 1, 0.75, True, 2,
0.75, False, 3, 0.75, False, 4, 0.75, 0.65, True, False, 1, 1, 1, '', True, 0.5, 600.0, 1.0,
True, None, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0.5, 0.5, 'OpenGVLab/InternVL-14B-224px', False,
0.7, 1.2, 128, False, False, 'positive', 'comma', 0, False, False, '', [], 0.8, 20,
'dpmpp_sde', 'v2', False, True, 'v1.1', 'None', '', 1, '', 'None', 1, '7,8,9', 1, 0.01, 0.2,
None, '', False, ['attention', 'adain_queries', 'adain_keys'], 1, 0, 0, True, 10, 'None', 16,
'None', 1, True, 'None', 2, True, 1, 0, True, 'none', 3, 4, 0.25, 0.25, 'THUDM/CogVideoX-2b',
'DDIM', 49, 6, 'balanced', True, 'None', 8, True, 1, 0, None, None, 45, 16, True, 'Describe
the video by detailing the following aspects:\n1. The main content and theme of the video.\n2.
The color, shape, size, texture, quantity, text, and spatial relationships of the objects.\n3.
Actions, events, behaviors temporal relationships, physical movement changes of the
objects.\n4. Background environment, light, style and atmosphere.\n5. Camera angles,
movements, and transitions used in the video.\n6. Thematic and aesthetic concepts associated
with the scene, i.e. realistic, futuristic, fairy tale, etc.\n', 'None', 2, True, 1, 0,
'0.9.1', '', 'diffusers', True, 41, 'None', 2, True, 1, 0, False, 0.03, 45, 'None', 2, True,
1, 0, 'None', True, 0, 'None', 2, True, 1, 0, 0, '', [], 0, '', [], 0, '', [], False, True,
False, True, False, False, False, False, 0, False, 'None', 2, True, 1, 0) kwargs={}
00:51:24-902723 ERROR gradio call: UnboundLocalError
┌───────────────────────────────────────── Traceback (most recent call last) ─────────────────────────────────────────┐
│ E:\SD.Next\modules\call_queue.py:31 in f │
│ │
│ 30 │ │ │ try: │
│ > 31 │ │ │ │ res = func(*args, **kwargs) │
│ 32 │ │ │ │ progress.record_results(id_task, res) │
│ │
│ E:\SD.Next\modules\txt2img.py:93 in txt2img │
│ │
│ 92 │ if processed is None: │
│ > 93 │ │ processed = processing.process_images(p) │
│ 94 │ processed = scripts.scripts_txt2img.after(p, processed, *args) │
│ │
│ E:\SD.Next\modules\processing.py:210 in process_images │
│ │
│ 209 │ │ │ with context_hypertile_vae(p), context_hypertile_unet(p): │
│ > 210 │ │ │ │ processed = process_images_inner(p) │
│ 211 │
│ │
│ E:\SD.Next\modules\processing.py:337 in process_images_inner │
│ │
│ 336 │ │ │ │ │ from modules.processing_diffusers import process_diffusers │
│ > 337 │ │ │ │ │ samples = process_diffusers(p) │
│ 338 │ │ │ │ else: │
│ │
│ E:\SD.Next\modules\processing_diffusers.py:449 in process_diffusers │
│ │
│ 448 │ if 'base' not in p.skip: │
│ > 449 │ │ output = process_base(p) │
│ 450 │ else: │
│ │
│ E:\SD.Next\modules\processing_diffusers.py:101 in process_base │
│ │
│ 100 │ │ else: │
│ > 101 │ │ │ output = shared.sd_model(**base_args) │
│ 102 │ │ if isinstance(output, dict): │
│ │
│ e:\SD.Next\venv\lib\site-packages\torch\utils\_contextlib.py:116 in decorate_context │
│ │
│ 115 │ │ with ctx_factory(): │
│ > 116 │ │ │ return func(*args, **kwargs) │
│ 117 │
│ │
│ e:\SD.Next\venv\lib\site-packages\diffusers\pipelines\flux\pipeline_flux.py:889 in __call__ │
│ │
│ 888 │ │ │ │ │
│ > 889 │ │ │ │ noise_pred = self.transformer( │
│ 890 │ │ │ │ │ hidden_states=latents, │
│ │
│ e:\SD.Next\venv\lib\site-packages\torch\nn\modules\module.py:1736 in _wrapped_call_impl │
│ │
│ 1735 │ │ else: │
│ > 1736 │ │ │ return self._call_impl(*args, **kwargs) │
│ 1737 │
│ │
│ e:\SD.Next\venv\lib\site-packages\torch\nn\modules\module.py:1747 in _call_impl │
│ │
│ 1746 │ │ │ │ or _global_forward_hooks or _global_forward_pre_hooks): │
│ > 1747 │ │ │ return forward_call(*args, **kwargs) │
│ 1748 │
│ │
│ e:\SD.Next\venv\lib\site-packages\accelerate\hooks.py:170 in new_forward │
│ │
│ 169 │ │ else: │
│ > 170 │ │ │ output = module._old_forward(*args, **kwargs) │
│ 171 │ │ return module._hf_hook.post_forward(module, output) │
│ │
│ e:\SD.Next\venv\lib\site-packages\diffusers\models\transformers\transformer_flux.py:522 in forward │
│ │
│ 521 │ │ │ else: │
│ > 522 │ │ │ │ encoder_hidden_states, hidden_states = block( │
│ 523 │ │ │ │ │ hidden_states=hidden_states, │
│ │
│ e:\SD.Next\venv\lib\site-packages\torch\nn\modules\module.py:1736 in _wrapped_call_impl │
│ │
│ 1735 │ │ else: │
│ > 1736 │ │ │ return self._call_impl(*args, **kwargs) │
│ 1737 │
│ │
│ e:\SD.Next\venv\lib\site-packages\torch\nn\modules\module.py:1747 in _call_impl │
│ │
│ 1746 │ │ │ │ or _global_forward_hooks or _global_forward_pre_hooks): │
│ > 1747 │ │ │ return forward_call(*args, **kwargs) │
│ 1748 │
│ │
│ e:\SD.Next\venv\lib\site-packages\diffusers\models\transformers\transformer_flux.py:193 in forward │
│ │
│ 192 │ │ # Process attention outputs for the `hidden_states`. │
│ > 193 │ │ attn_output = gate_msa.unsqueeze(1) * attn_output │
│ 194 │ │ hidden_states = hidden_states + attn_output │
└─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
UnboundLocalError: local variable 'attn_output' referenced before assignment
00:51:25-936179 DEBUG GC: current={'gpu': 12.0, 'ram': 6.86, 'oom': 0} prev={'gpu': 12.0, 'ram': 6.86} load={'gpu':
100, 'ram': 22} gc={'gpu': 0.0, 'py': 510} fn=f:end why=threshold time=0.23
Not related, but with a 3000-series GPU you shouldn't be using --use-xformers. 3000- and 4000-series GPUs have the processing power to take advantage of the much faster SDP (Scaled Dot Product) attention method.
@brknsoul Thanks, I didn't know that. Does it need to be enabled separately, or does it require no action from me?
Just don't use the --use-xformers command line option. In general, if you're not sure about any specific options, don't use them.
Even --use-cuda should not be needed normally, as it's auto-detected.
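For reference, the SDP attention mentioned above is exposed in PyTorch as `torch.nn.functional.scaled_dot_product_attention`, which automatically dispatches to a fused backend (Flash Attention, memory-efficient, or math) on supported GPUs. A minimal standalone sketch of the call, independent of SD.Next:

```python
import torch
import torch.nn.functional as F

# (batch, heads, sequence length, head dim) - toy sizes for illustration
q = torch.randn(1, 8, 16, 64)
k = torch.randn(1, 8, 16, 64)
v = torch.randn(1, 8, 16, 64)

# PyTorch picks the fastest available kernel for the current device/dtype
out = F.scaled_dot_product_attention(q, k, v)
print(out.shape)  # torch.Size([1, 8, 16, 64])
```

Since SD.Next selects this automatically when no attention flag is passed, simply dropping --use-xformers should be enough.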
Just in case it's related: I built SD.Next with CUDA support from the Dockerfile. I added the FLUX.1-dev-qint4 model from the "Model -> All" tab. The model doesn't load, with the following error:
11:09:45-124127 INFO Load model: select="Diffusers/Disty0/FLUX.1-dev-qint4
[82811df42b]"
11:09:45-127654 DEBUG Load model: type=FLUX
model="Diffusers/Disty0/FLUX.1-dev-qint4"
repo="Disty0/FLUX.1-dev-qint4" unet="None" te="None"
vae="Automatic" quant=qint4 offload=balanced
dtype=torch.bfloat16
11:09:45-128570 DEBUG HF login: no token provided
11:09:45-344689 DEBUG GC: current={'gpu': 0.22, 'ram': 1.19, 'oom': 0}
prev={'gpu': 0.22, 'ram': 1.19} load={'gpu': 2, 'ram':
2} gc={'gpu': 0.0, 'py': 8098}
fn=load_diffuser_force:load_flux why=force time=0.22
11:09:45-701086 DEBUG Quantization: type=quanto version=0.2.6
fn=load_flux:load_flux_quanto
11:09:45-702249 ERROR Quantization: type=quanto offload=balanced not
supported
11:09:58-398634 ERROR Load model: type=FLUX failed to load Quanto
transformer: Could not run
'aten::_convert_weight_to_int4pack' with arguments from
the 'CPU' backend. This could be because the operator
doesn't exist for this backend, or was omitted during
the selective/custom build process (if using custom
build). If you are a Facebook employee using PyTorch on
mobile, please visit https://fburl.com/ptmfixes for
possible resolutions.
'aten::_convert_weight_to_int4pack' is only available
for these backends: [CUDA, Meta, BackendSelect, Python,
I'm not sure why CPU is being used as a backend. Can I somehow switch it to CUDA via some configuration?
Quanto is not available on all platforms - you did not specify what your GPU and backend are.
Regarding CPU: it's always used first with balanced offload, and model components are then moved to the GPU as needed. Without that, there is no way to fit the entire model on the GPU.
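The offload behavior described above is commonly implemented with forward hooks, in the style of accelerate/diffusers CPU offload: weights live on the CPU, each submodule is moved to the GPU just before its forward pass and back afterwards. A toy sketch of that pattern (names like `attach_offload_hooks` are illustrative, not the actual SD.Next/accelerate API):

```python
import torch
import torch.nn as nn

def attach_offload_hooks(module: nn.Module, device: str) -> None:
    """Move a module to `device` for its forward pass, then back to CPU."""
    def pre_hook(mod, args):
        mod.to(device)          # load weights onto the compute device
    def post_hook(mod, args, output):
        mod.to("cpu")           # free the compute device again
        return output
    module.register_forward_pre_hook(pre_hook)
    module.register_forward_hook(post_hook)

layer = nn.Linear(4, 4)
attach_offload_hooks(layer, "cpu")  # would be "cuda" on a GPU machine
out = layer(torch.randn(2, 4))      # weights end up back on CPU afterwards
```

This is why the CPU backend shows up first in the log: the model is staged there by design, and the int4 load failure happened before components reached the GPU.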
Ah OK, I see.
My host platform is Docker on the current Ubuntu LTS. Just in case it helps:
device: NVIDIA GeForce RTX 4070 Ti (1) (sm_90) (8, 9)
cuda: 12.6
cudnn: 90501
driver: 535.183.01
arch: x86_64
cpu: x86_64
system: Linux
release: 6.8.0-57-generic
python: 3.11.11
active: cuda
dtype: torch.bfloat16
vae: torch.bfloat16
unet: torch.bfloat16
Does this still happen with the latest version?
The _convert_weight_to_int4pack issue should be fixed now.
Also, xFormers is only compatible with SD 1.5 and SDXL. Added a warning and disabled xFormers for other models in the dev branch.