quivr
Desktop app
A desktop app version of the webapp.
We need to decide how to handle the offline case: local LLMs and embeddings, or a fully API-based setup?
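One way to frame the decision above is a runtime fallback: probe for connectivity and pick a backend accordingly. The sketch below is a minimal illustration; the backend names and the `check_internet` helper are hypothetical, not part of the quivr codebase.

```python
# Sketch: choose between a hosted API and a local model based on connectivity.
# Backend names and the probe target are illustrative assumptions.
import socket


def check_internet(host="1.1.1.1", port=53, timeout=1.5):
    """Return True if a TCP connection to a well-known resolver succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def pick_backend(online):
    # Online -> hosted API; offline -> local LLM + local embeddings.
    return "openai_api" if online else "local_llama_cpp"


if __name__ == "__main__":
    print(pick_backend(check_internet()))
```

The probe is deliberately cheap (one TCP handshake to a public DNS resolver) so the app can re-check periodically without noticeable cost.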
It might be interesting to see LLama.cpp for local model
I manually modified the code (I haven't forked or committed yet) and managed to make it work with llama.cpp, using the gpt4all 13B model. The project lends itself well to this kind of integration.
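For reference, a llama.cpp run like the one described above boils down to invoking its `main` binary with a model path, a prompt, and a token budget. The sketch below just composes that command line from Python; the model filename is a placeholder, not an actual file from the repo.

```python
# Sketch: composing a llama.cpp command line (2023-era `main` binary flags).
# The model path below is a hypothetical placeholder for a ggml-format file.


def build_llama_cmd(model_path, prompt, n_predict=128):
    """Compose the argv list for llama.cpp's `main` binary."""
    return [
        "./main",
        "-m", model_path,       # path to a ggml-format model file
        "-p", prompt,           # prompt text
        "-n", str(n_predict),   # number of tokens to generate
    ]


cmd = build_llama_cmd("./models/ggml-model-q4_0.bin", "What is Quivr?")
# subprocess.run(cmd) would launch inference if the binary and model exist.
```

Wiring this into the app then mostly means capturing stdout from the process instead of reading an API response.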
Unfortunately, the performance is not good: the model cannot respond as well as GPT-3 does, and it takes a long time to get an answer, despite my machine having an RTX 2060 and an 8-core (16-thread) Ryzen 7 4800H. As soon as I have time, I will upload a fork of the repository with llama.cpp support.
But don't hold out much hope that it will work well. I think a larger model, maybe 65B, could do a little better.
Yes. Right now, gpt4all on CPU is unfortunately not showing convincing performance.
Oh sorry, I missed your comment.
Yeah, for now it is not efficient enough. However, I have faith that we will find an answer to that, for example "shared nodes" for consumer users and just a bunch of CPUs for companies 😂