Fix memory issues when installing models on Windows
Summary
Windows treats file handles differently than macOS and Linux, and Model Manager v3 has exposed some issues here, so here are a few patches to address them. I wrote a longer description, but the power went out and I'm not rewriting all of it; I did want to make sure to thank @skunkworxdark for catching and fixing a serious memory issue.
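For context, here's a minimal, self-contained illustration (not InvokeAI code) of the Windows behavior in question: while a file is still open and memory-mapped, moving or replacing it fails with a PermissionError, whereas the same move succeeds on macOS and Linux.

```python
# Minimal illustration (not InvokeAI code): an open memory map blocks
# moving/replacing the file on Windows, while POSIX systems allow it.
import mmap
import os
import tempfile

src = tempfile.NamedTemporaryFile(delete=False, suffix=".bin")
src.write(b"\x00" * 4096)
src.close()

f = open(src.name, "rb")
m = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)

dst = src.name + ".moved"
moved = False
try:
    os.replace(src.name, dst)  # PermissionError on Windows while the handle/mapping is open
    moved = True               # succeeds on macOS/Linux
except PermissionError as err:
    print(f"move blocked while the file is mapped: {err}")

m.close()
f.close()
if not moved:
    os.replace(src.name, dst)  # succeeds once the handle and mapping are released
os.remove(dst)
```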
Related Issues / Discussions
Closes: #8644 #8636 #8628 #8627
QA Instructions
To test, import various models from Starter Models and elsewhere: GGUF files, Safetensors files (particularly large ones), and Diffusers multi-file models. The fix targets Windows specifically, but the changes should benefit every OS and not break anything on macOS or Linux. Tested on macOS and Windows so far.
Merge Plan
Changes the GGUF loader and import_state_dict in model_on_disk.py - shouldn't conflict with anything; the changes are small and self-contained.
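To make the intent concrete, here's a rough sketch of the general pattern rather than the exact diff: copy tensor data out of the GGUF reader's memory-mapped arrays so the mapping (and its file handle) can be dropped before the installer moves the file. It assumes the `gguf` package's `GGUFReader`; the real loader wraps quantized tensors differently.

```python
# Hedged sketch of the general pattern, not the exact InvokeAI change.
from pathlib import Path

import gguf
import numpy as np
import torch


def read_gguf_state_dict(path: Path) -> dict[str, torch.Tensor]:
    reader = gguf.GGUFReader(path)
    state_dict: dict[str, torch.Tensor] = {}
    for tensor in reader.tensors:
        # np.array(..., copy=True) detaches the data from the memory map;
        # without the copy, every tensor keeps the mapped file alive, and on
        # Windows that open mapping blocks moving/deleting the model file.
        state_dict[tensor.name] = torch.from_numpy(np.array(tensor.data, copy=True))
    del reader  # drop the reader (and its mapping) before the file is moved
    return state_dict
```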
Checklist
- [x] The PR has a short but descriptive title, suitable for a changelog
- [ ] Tests added / updated (if applicable) (n/a)
- [ ] ❗Changes to a redux slice have a corresponding migration (not sure what this is)
- [ ] Documentation added / updated (if applicable) (n/a)
- [ ] Updated What's New copy (if doing a release after this PR)
I'm sorry, but I can't seem to appease ruff here, even after configuring my local ruff with this repo's pyproject.toml. Edit: fixed 👍
@gogurtenjoyer a few questions that relate more to Model Manager v3 than to your PR, but I encountered them while testing your PR.
- One of my test cases is installing a diffusers-style FLUX LoRA. It installs and is recognized as a diffusers LoRA, but when I try to use it I get an "Is a directory" error. Is this type of LoRA known not to work? It happens on both the main branch and your PR, so it isn't a PR issue, but I wonder whether it's a recently introduced regression. I thought we supported diffusers LoRAs.
- Are these FLUX LoRA .safetensors models supposed to work? https://huggingface.co/XLabs-AI/flux-lora-collection/tree/main . The model probe can't seem to identify them.
- MMv3 is not identifying FLUX quantized .gguf models either, for example: https://huggingface.co/city96/AuraFlow-v0.3-gguf/tree/main
- And these quantized .gguf models are correctly recognized and installed, but cause a core dump when executed: https://huggingface.co/wikeeyang/SRPO-Refine-Quantized-v1.0/tree/main
Are these known issues with FLUX models?
@lstein - thanks, I'll try that flux schnell model to see what's going on. For the other questions:
- I've actually never used a diffusers style lora and have only ever used single file loras, so I'm not sure there.
- Invoke hasn't ever supported AuraFlow, so no change there.
- Same as above; this isn't technically a Flux model and Invoke doesn't support it.
- Never heard of this model before - is this a similar situation to AuraFlow?
Okay, the Flux Schnell Q2 linked above installed and ran correctly for me on Windows - is the core dump during generation, or during install?
Here are a few more GGUFs to test (which we also used when trying to triage the Windows issue) - the GGUF sizes/versions are in the buttons along the top:
https://civitai.com/models/630820?modelVersionId=944736 https://civitai.com/models/920261?modelVersionId=1030326
I suspect the issue is with using _mmap in loaders.py, so I'm trying an alternative there. I was bad for attempting this, and thought nestling it within a try/except would make it okay :)
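For anyone following along, the pattern being described looks roughly like this (hypothetical helper name; the actual code lives in loaders.py and may differ): try the memory-mapped path and fall back to a plain read. The catch is that a try/except only covers Python-level failures while creating the map; a hard fault while reading through an established mapping is not a catchable exception, which is one reason wrapping it wasn't enough.

```python
# Hypothetical sketch of the try/except-around-mmap approach under discussion;
# the function name is a stand-in, not the actual loaders.py code.
import mmap
from pathlib import Path


def read_model_bytes(path: Path) -> bytes:
    with open(path, "rb") as f:
        try:
            # Memory-mapped fast path: avoids an up-front full read.
            with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
                return bytes(m)  # copy out so nothing outlives the mapping
        except (OSError, ValueError):
            # Plain-read fallback. This only catches failures when creating
            # the map, not faults that occur while dereferencing it.
            f.seek(0)
            return f.read()
```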
@lstein could you try with this latest fix? @JPPhoto tested as well on Linux (WSL2 I believe!) and it seems to do the trick.
Working now! The SRPO models (e.g. https://huggingface.co/wikeeyang/SRPO-Refine-Quantized-v1.0/tree/main) are working as well.