
Results 84 comments of Wagner Bruna

@JustMaier, I'd like to tackle the model loading optimizations. Taking a look at the code, I believe there's room for improvement regardless of the backend, but I'd need to...

@JustMaier, I opened a discussion about the model loading optimization: #789.

Since my 3400G seems to behave the same for any GGML_VK_DEVICE0_MEMORY value above 0, I guess this only matters for splitting across devices. But if that's the case, perhaps it'd...

Could you include a log with `-v` of the command that crashes? Also, does it happen with any LoRA? Could you point me to the one you're testing with?

> I get all the way past the model loading and when I get to the generation process I keep getting this error:
>
> ggml_new_object: not enough space in...

> Does the problem still exist?

Maybe not: `master-338-faabc5a` included a memory-related ggml fix: ggml-org/llama.cpp#16679. @evcharger, could you please test again with a more recent release?

> It did work by building the bins but is very slow using chroma. Around 220s/it
> On comfy I get under 30s/it sometimes even less. I don't know if...

Works for me... kind of 🙂 I don't think SD_VK_DEVICE can be a _replacement_ for GGML_VK_VISIBLE_DEVICES:

```
SD_VK_DEVICE=0 ./sd (...)
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = AMD Radeon...
```
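To illustrate the distinction the log above hints at (the device indices and the elided `./sd` arguments here are my own assumptions, not taken from the report), the two variables would be used roughly like this:

```shell
# GGML_VK_VISIBLE_DEVICES filters enumeration at the ggml level:
# only the listed device indices exist as far as the backend is concerned.
GGML_VK_VISIBLE_DEVICES=1 ./sd ...

# SD_VK_DEVICE selects a device after enumeration; the other Vulkan
# devices remain visible to ggml, so it is not a full replacement.
SD_VK_DEVICE=0 ./sd ...
```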

That 'B' stuff on the regex took me a bit too long to understand, so a comment would've been appreciated 🙂 Apart from that, the code looks good to me,...

@leejet, just to let you know: this PR has been included in Koboldcpp for a few releases now (1.94 - 1.96.2) to fix #588, with no issues reported so...