Ryan Dick
> > @lstein if you're interested in looking into optimizing the existing LoRA patching code, it could make sense to start there. Otherwise, I think we could proceed with a...
Just ran the following manual tests:
- Speed / compatibility:
  - Non-quantized: 2.08 it/s
  - GGUF Q8_0: 1.61 it/s
  - GGUF Q4_K_S: 1.24 it/s
  - GGUF Q2_K: 1.27 it/s
  - ...
Setting `force_tiled_decode: true` forces tiling to be used, but the default tile size (when `tile_size=0`) is determined based on the model architecture. For an SDXL model, the default tile size...
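To make the resolution order concrete, here is a hedged sketch of the behavior described above. The function name, the architecture strings, and the constants are placeholders for illustration, not InvokeAI's actual implementation or values:

```python
# Placeholder defaults for illustration only; the real per-architecture
# defaults live in the InvokeAI codebase and may differ.
SDXL_DEFAULT_TILE = 1024
GENERIC_DEFAULT_TILE = 512

def resolve_tile_size(tile_size: int, architecture: str) -> int:
    """tile_size == 0 means: pick a default based on the model architecture."""
    if tile_size != 0:
        return tile_size  # an explicit user setting always wins
    if architecture == "sdxl":
        return SDXL_DEFAULT_TILE
    return GENERIC_DEFAULT_TILE
```

The key point is that `tile_size: 0` is a sentinel for "architecture-dependent default", while `force_tiled_decode: true` only controls *whether* tiling happens, not the tile dimensions.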
I took an initial read through this today. Before I leave any comments on the code, I think some more design discussion is warranted. I'll start the conversation here -...
I'm excited about how this is progressing 🚀 I took a stab at working backwards from this draft PR to a design document so that we can align on some...
I tested this today. As of `v5.6.0rc4`, LoKR models work with most base FLUX models. The one exception is bitsandbytes NF4 quantized base models, which will be addressed in https://github.com/invoke-ai/InvokeAI/pull/7577....
I feel like whether we use xformers should be controlled via the config rather than by whether xformers happens to be installed. Is there a reason for doing it this way?
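For clarity, a minimal sketch of the config-gated approach being suggested, where an explicit flag must be set *and* the package must be importable. The function and parameter names are illustrative, not actual InvokeAI code:

```python
import importlib.util

def should_use_xformers(config_enabled: bool) -> bool:
    """Use xformers only when the config explicitly enables it
    AND the package is actually installed."""
    installed = importlib.util.find_spec("xformers") is not None
    return config_enabled and installed
```

With this shape, installing xformers no longer silently changes behavior; the config flag remains the single source of truth.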
Of all the possible duplicate handling methods, `error` still makes the most sense to me as the default. Maybe we start with `min`, `max`, `error`? That would handle most cases...
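The three proposed strategies can be sketched as follows. This is an illustrative stand-alone function, not the actual implementation, assuming the duplicates being resolved are values that share a key:

```python
def resolve_duplicate(values, method: str = "error"):
    """Resolve a set of duplicate values using one of: 'min', 'max', 'error'."""
    unique = set(values)
    if len(unique) == 1:
        return values[0]  # identical duplicates: no real conflict
    if method == "error":
        raise ValueError(f"conflicting duplicate values: {sorted(unique)}")
    if method == "min":
        return min(values)
    if method == "max":
        return max(values)
    raise ValueError(f"unknown duplicate-handling method: {method!r}")
```

Defaulting to `error` keeps conflicts loud, while `min`/`max` give users an explicit, predictable way to opt into silent resolution.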
Test results on my Apple M3:

### SD1.5, 1024x1024
- torch 2.2.2 sliced: works
- torch 2.4.1 sliced: produces noise
- torch 2.4.1 non-sliced: maxes out memory and is extremely...