Fix memory issues when installing models on Windows
Summary
Windows treats file handles differently than macOS and Linux, and Model Manager v3 has exposed some issues here, so here are a few patches to address them. I wrote a longer description, but the power went out and I'm not rewriting all of it; I did want to make sure to thank @skunkworxdark for catching and fixing a serious memory issue.
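For context, here's a minimal, self-contained illustration (not InvokeAI code) of the Windows behavior in question: while a file is still open and memory-mapped, moving or replacing it fails with a PermissionError, whereas the same move succeeds on macOS and Linux.

```python
# Minimal illustration (not InvokeAI code): an open memory map blocks
# moving/replacing the file on Windows, while POSIX systems allow it.
import mmap
import os
import tempfile

src = tempfile.NamedTemporaryFile(delete=False, suffix=".bin")
src.write(b"\x00" * 4096)
src.close()

f = open(src.name, "rb")
m = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)

dst = src.name + ".moved"
moved = False
try:
    os.replace(src.name, dst)  # PermissionError on Windows while the handle/mapping is open
    moved = True               # succeeds on macOS/Linux
except PermissionError as err:
    print(f"move blocked while the file is mapped: {err}")

m.close()
f.close()
if not moved:
    os.replace(src.name, dst)  # succeeds once the handle and mapping are released
os.remove(dst)
```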
Related Issues / Discussions
Closes: #8644 #8636 #8628 #8627
QA Instructions
To test, import various models from Starter Models and elsewhere: GGUF files, Safetensors files (particularly large ones), and Diffusers multi-file models. The fix targets Windows specifically, but the changes should benefit every OS and not break anything on macOS or Linux. Tested on macOS and Windows so far.
Merge Plan
Changes the GGUF loader and import_state_dict in model_on_disk.py - shouldn't conflict with anything; the changes are small and self-contained.
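To make the intent concrete, here's a rough sketch of the general pattern rather than the exact diff: copy tensor data out of the GGUF reader's memory-mapped arrays so the mapping (and its file handle) can be dropped before the installer moves the file. It assumes the `gguf` package's `GGUFReader`; the real loader wraps quantized tensors differently.

```python
# Hedged sketch of the general pattern, not the exact InvokeAI change.
from pathlib import Path

import gguf
import numpy as np
import torch


def read_gguf_state_dict(path: Path) -> dict[str, torch.Tensor]:
    reader = gguf.GGUFReader(path)
    state_dict: dict[str, torch.Tensor] = {}
    for tensor in reader.tensors:
        # np.array(..., copy=True) detaches the data from the memory map;
        # without the copy, every tensor keeps the mapped file alive, and on
        # Windows that open mapping blocks moving/deleting the model file.
        state_dict[tensor.name] = torch.from_numpy(np.array(tensor.data, copy=True))
    del reader  # drop the reader (and its mapping) before the file is moved
    return state_dict
```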
Checklist
- [x] The PR has a short but descriptive title, suitable for a changelog
- [ ] Tests added / updated (if applicable) (n/a)
- [ ] ❗Changes to a redux slice have a corresponding migration (not sure what this is)
- [ ] Documentation added / updated (if applicable) (n/a)
- [ ] Updated What's New copy (if doing a release after this PR)
I'm sorry, but I can't seem to appease ruff here, even after configuring my local ruff with this repo's pyproject.toml. Edit: fixed 👍
@gogurtenjoyer a few questions that relate more to Model Manager v3 than to your PR, but I encountered them while testing your PR.
- One of my test cases is installing a diffusers-style FLUX LoRA. It installs and is recognized as a diffusers LoRA, but when I try to use it I get an "Is a directory" error. Is this type of LoRA known not to work? It happens on both the main branch and your PR, so it isn't a PR issue, but I wonder whether it's a recently introduced regression. I thought we supported diffusers LoRAs.
- Are these FLUX LoRA .safetensors models supposed to work? https://huggingface.co/XLabs-AI/flux-lora-collection/tree/main . The model probe can't seem to identify them.
- MMv3 is not identifying FLUX quantized .gguf models either, for example: https://huggingface.co/city96/AuraFlow-v0.3-gguf/tree/main
- And these quantized .gguf models are correctly recognized and installed, but cause a core dump when executed: https://huggingface.co/wikeeyang/SRPO-Refine-Quantized-v1.0/tree/main
Are these known issues with FLUX models?
@lstein - thanks, I'll try that flux schnell model to see what's going on. For the other questions:
- I've actually never used a diffusers style lora and have only ever used single file loras, so I'm not sure there.
- Invoke hasn't ever supported AuraFlow, so no change there.
- Same as above; this isn't technically a Flux model and Invoke doesn't support it.
- Never heard of this model before - is this a similar situation to AuraFlow?
Okay, the Flux Schnell Q2 linked above installed and ran correctly for me on Windows - is the core dump during generation, or during install?
Here are a few more GGUFs to test (which we also used when trying to triage the Windows issue) - the GGUF sizes/versions are in the buttons along the top:
https://civitai.com/models/630820?modelVersionId=944736 https://civitai.com/models/920261?modelVersionId=1030326
I suspect the issue is with using _mmap in loaders.py, so I'm trying an alternative there. I was bad for attempting this, and thought nestling it within a try/except would make it okay :)
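For anyone following along, the pattern being described looks roughly like this (hypothetical helper name; the actual code lives in loaders.py and may differ): try the memory-mapped path and fall back to a plain read. The catch is that a try/except only covers Python-level failures while creating the map; a hard fault while reading through an established mapping is not a catchable exception, which is one reason wrapping it wasn't enough.

```python
# Hypothetical sketch of the try/except-around-mmap approach under discussion;
# the function name is a stand-in, not the actual loaders.py code.
import mmap
from pathlib import Path


def read_model_bytes(path: Path) -> bytes:
    with open(path, "rb") as f:
        try:
            # Memory-mapped fast path: avoids an up-front full read.
            with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
                return bytes(m)  # copy out so nothing outlives the mapping
        except (OSError, ValueError):
            # Plain-read fallback. This only catches failures when creating
            # the map, not faults that occur while dereferencing it.
            f.seek(0)
            return f.read()
```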
@lstein could you try with this latest fix? @JPPhoto tested as well on Linux (WSL2 I believe!) and it seems to do the trick.
Working now! The SRPO models (e.g. https://huggingface.co/wikeeyang/SRPO-Refine-Quantized-v1.0/tree/main) are working as well.