Salvatore Rossitto

Results 4 comments of Salvatore Rossitto

Hi, I haven't tested it with different params yet. I'm still evaluating it...

On Fri, Oct 6, 2023, 17:32 Forkoz ***@***.***> wrote:
> Does this work for Kquants + offloading...

In theory yes, but I tried it and it didn't work for me. Maybe you can retry if the version of llama.cpp has changed. It should be in the code as commented....

I have implemented a version that looks at the OLLAMA_KV_CACHE_TYPE env var to apply the quantization to all models. It's working well for me, and I will keep using this until there...
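
To illustrate the idea in the comment above, here is a minimal sketch of reading OLLAMA_KV_CACHE_TYPE and falling back to a default when it is unset or unrecognized. This is not the code from the comment: the function name `resolve_kv_cache_type`, the allowed set of values, and the `f16` default are all assumptions made for illustration, and the values that are actually supported depend on the llama.cpp build in use.

```python
import os

# Assumed values for illustration only; the real set depends on the llama.cpp build.
ALLOWED_KV_CACHE_TYPES = {"f16", "q8_0", "q4_0"}
DEFAULT_KV_CACHE_TYPE = "f16"  # assumed default (unquantized cache)


def resolve_kv_cache_type(env=None):
    """Return the KV cache type requested via OLLAMA_KV_CACHE_TYPE.

    Hypothetical helper: falls back to the default when the variable is
    unset, empty, or not one of the assumed allowed values.
    """
    env = os.environ if env is None else env
    raw = env.get("OLLAMA_KV_CACHE_TYPE", "").strip().lower()
    if raw not in ALLOWED_KV_CACHE_TYPES:
        return DEFAULT_KV_CACHE_TYPE
    return raw


if __name__ == "__main__":
    # The resolved value would then be applied to every model that gets loaded.
    print(resolve_kv_cache_type())
```

Resolving the value once and applying it to every model load is what makes the setting global rather than per-model, which matches the behavior described in the comment.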

Each new node has been implemented in its own file and then referenced in the __init__.py