
[bug]: model cache logs negative VRAM requested

Open · keturn opened this issue 8 months ago · 2 comments

Is there an existing issue for this problem?

  • [x] I have searched the existing issues

Operating system

Linux

GPU vendor

Nvidia (CUDA)

GPU model

RTX 3060

GPU VRAM

12 GB

Version number

5.9

What happened

When generating Flux images, I frequently see messages like this in the log:

[21:15:07,670]::[ModelManagerService]::INFO --> [MODEL CACHE] Loaded model '924bda16-8c73-4aed-a06c-6216705962ea:transformer' (Flux) onto cuda device in 1.20s. Total model size: 8573.12MB, VRAM: 6444.62MB (75.2%)        
[21:15:38,492]::[InvokeAI]::WARNING --> Loading 0.0 MB into VRAM, but only -283.4375 MB were requested. This is the minimum set of weights in VRAM required to run the model.                                                                          
[21:15:38,494]::[ModelManagerService]::INFO --> [MODEL CACHE] Loaded model 'c75a604c-a146-44be-a294-36a1842c3f7e:vae' (AutoEncoder) onto cuda device in 0.14s. Total model size: 159.87MB, VRAM: 0.00MB (0.0%)   

Two things about that message are weird:

  • it reports that a negative number of megabytes were requested?
  • it says it loaded zero megabytes?
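For context, the arithmetic behind a message like this can be sketched as follows. This is a minimal illustration, not InvokeAI's actual cache code; the function and parameter names (`plan_vram_load`, `vram_budget`, `vram_in_use`) are assumptions. The point is that a negative "requested" figure falls out naturally if the cache is already over its VRAM budget when a new model arrives, so the grant goes negative and the actual load clamps to zero:

```python
import logging

logging.basicConfig(level=logging.WARNING, format="%(levelname)s --> %(message)s")
logger = logging.getLogger("model_cache_sketch")

MB = 1024 ** 2


def plan_vram_load(model_total: int, vram_budget: int, vram_in_use: int) -> int:
    """Decide how many bytes of a model's weights to place in VRAM.

    Hypothetical sketch of the accounting only, not InvokeAI's internals.
    """
    # Bytes the cache would *like* to grant this model. If other models
    # already occupy more VRAM than the budget allows, this goes negative.
    vram_available = vram_budget - vram_in_use

    # The amount actually loaded is clamped to [0, model_total]; a model
    # cannot give back VRAM it never used, so the floor is zero.
    vram_to_load = max(0, min(model_total, vram_available))

    if vram_to_load > vram_available:
        # Reproduces the confusing shape of the message in this issue:
        # "requested" is the raw (possibly negative) availability, while
        # the loaded figure is the clamped minimum.
        logger.warning(
            "Loading %.1f MB into VRAM, but only %s MB were requested. "
            "This is the minimum set of weights in VRAM required to run the model.",
            vram_to_load / MB,
            vram_available / MB,
        )
    return vram_to_load


# Budget already exceeded by ~283 MB when the 159.87 MB VAE arrives:
# availability is negative, the load clamps to 0, and the warning fires.
plan_vram_load(
    model_total=int(159.87 * MB),
    vram_budget=6144 * MB,
    vram_in_use=6144 * MB + int(283.4375 * MB),
)
```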

What you expected to happen

Not sure what to expect from the model cache's logging.

keturn · Mar 21 '25 21:03

InvokeAI version 5.10.1, RTX 3060 12 GB. I'm seeing a similar negative VRAM request, but with various SDXL models. It happens after an unknown but seemingly consistent number of generations between restarts; if I had to guess, around 50. It's been an issue for me since late 2024:

[2025-05-09 10:53:48,195]::[InvokeAI]::INFO --> Executing queue item 31946, session 1350f061-30a5-4249-97f1-6f2a7245d4e4
[2025-05-09 10:53:49,384]::[ModelManagerService]::INFO --> [MODEL CACHE] Loaded model 'cb09f201-e122-4083-a7ee-2ea26e0985ce:unet' (UNet2DConditionModel) onto cuda device in 1.16s. Total model size: 4897.05MB, VRAM: 4897.05MB (100.0%)
[2025-05-09 10:53:50,578]::[ModelManagerService]::INFO --> [MODEL CACHE] Loaded model 'cb09f201-e122-4083-a7ee-2ea26e0985ce:scheduler' (EDMDPMSolverMultistepScheduler) onto cuda device in 0.00s. Total model size: 0.00MB, VRAM: 0.00MB (0.0%)

  0%|          | 0/34 [00:00<?, ?it/s]
  ...
100%|##########| 34/34 [00:25<00:00,  1.36it/s]
C:\Users\nnn\invokeai\.venv\Lib\site-packages\invokeai\app\invocations\baseinvocation.py:183: PydanticDeprecatedSince211: Accessing the 'model_fields' attribute on the instance is deprecated. Instead, you should access this attribute from the model class. Deprecated in Pydantic V2.11 to be removed in V3.0.
  for field_name, field in self.model_fields.items():
[2025-05-09 10:54:16,629]::[InvokeAI]::WARNING --> Loading 0.0 MB into VRAM, but only -1205.203125 MB were requested. This is the minimum set of weights in VRAM required to run the model.
Process exited with code: 0

firesign · May 11 '25 12:05

From Ryan in Discord: https://discord.com/channels/1020123559063990373/1149506274971631688/1326650031310114816

This warning is usually technically accurate when it pops up, but I can definitely see why it would be confusing. It means that we would ideally like to use 600MB less VRAM to be safe, but couldn't find a way to offload those weights safely. How much VRAM do you have?

psychedelicious · Aug 25 '25 03:08
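Read that way, the magnitude of the negative number is simply how far over budget the cache is: VRAM it would like to reclaim but cannot offload safely. A hypothetical rephrasing of the warning that avoids the negative figure entirely; the helper below is a sketch, not a proposed InvokeAI patch:

```python
MB = 1024 ** 2


def describe_vram_request(vram_available_bytes: float) -> str:
    """Turn the raw (possibly negative) availability figure into a message
    a reader can parse at a glance. Hypothetical helper, not InvokeAI code."""
    if vram_available_bytes >= 0:
        return f"{vram_available_bytes / MB:.1f} MB of VRAM available for this model"
    # Negative availability: the cache is over budget by that amount and
    # could not offload enough other weights to get back under it.
    return (
        f"VRAM budget exceeded by {-vram_available_bytes / MB:.1f} MB; "
        f"loading only the minimum weights required to run the model"
    )


print(describe_vram_request(-283.4375 * MB))
# VRAM budget exceeded by 283.4 MB; loading only the minimum weights required to run the model
```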