Hirzinger Robert
Hi Krish, thank you for the quick reply. I was not able to find /api/generate in the Swagger of LiteLLM (https://litellm-api.up.railway.app/). Continue.dev tries to contact url:port**/api/generate** directly when selecting...
Thx for the fix, I've tested it with the latest stable version. It is forwarding the prompt to Ollama at **:11434/api/generate**, but it directly crashes my Ollama instance with Qwen2.5-Coder-3B...
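To rule out Continue.dev/LiteLLM as the cause of such a crash, one can send a minimal request straight to Ollama's /api/generate endpoint. A sketch below, assuming Ollama is listening on the default localhost:11434 and the `qwen2.5-coder:3b` model tag has been pulled:

```python
import requests

# Minimal direct call to Ollama's /api/generate, bypassing LiteLLM/Continue.dev.
# Assumes a local Ollama on the default port and the qwen2.5-coder:3b model tag.
resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "qwen2.5-coder:3b",
        "prompt": "def fib(n):",
        "stream": False,  # return one JSON object instead of a token stream
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["response"])
```

If this minimal call already brings the instance down, the problem is on the Ollama/model side rather than in the proxying.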
Only models that are trained for FIM (fill-in-the-middle) are compatible. For autocomplete you want a small, fast model (qwen2.5-coder-3b works well), and for chat you use a bigger model, like Codestral...
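For illustration, autocomplete requests are typically built from the model's FIM special tokens rather than a chat prompt. A minimal sketch, assuming Qwen2.5-Coder's documented `<|fim_prefix|>`/`<|fim_suffix|>`/`<|fim_middle|>` tokens and Ollama's raw mode (which skips the chat template):

```python
import requests

# FIM (fill-in-the-middle) completion: the model fills the gap between
# prefix and suffix. Token names follow Qwen2.5-Coder's FIM format.
prefix = "def add(a, b):\n    "
suffix = "\n\nprint(add(1, 2))"
fim_prompt = f"<|fim_prefix|>{prefix}<|fim_suffix|>{suffix}<|fim_middle|>"

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "qwen2.5-coder:3b",
        "prompt": fim_prompt,
        "raw": True,      # send the prompt as-is, without the chat template
        "stream": False,
    },
    timeout=120,
)
print(resp.json()["response"])  # expected: something like "return a + b"
```

A chat-tuned model without these FIM tokens will just continue the text (or emit garbage) instead of filling in the middle, which is why only FIM-trained models work for autocomplete.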
Hi @galvanoid, we have the same issues with 70K files; it already starts to get worse at 10K files. There is always a noticeable delay before the request is sent to the...
My assumption is that the PostgreSQL for OWUI is causing the slowdown. Is there any recommendation for the PostgreSQL config? Thx
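For context, such server settings are usually adjusted via `ALTER SYSTEM`. A minimal sketch, assuming a psycopg2 connection; the connection string and the values below are illustrative assumptions, not measured recommendations for OWUI — size them to the host's RAM:

```python
import psycopg2

# Illustrative memory-related settings; adjust values to your hardware.
# The dbname/user in the connection string are assumptions.
conn = psycopg2.connect("dbname=openwebui user=postgres")
conn.autocommit = True  # ALTER SYSTEM cannot run inside a transaction block
with conn.cursor() as cur:
    cur.execute("ALTER SYSTEM SET shared_buffers = '2GB'")        # needs a restart
    cur.execute("ALTER SYSTEM SET effective_cache_size = '6GB'")  # planner hint only
    cur.execute("ALTER SYSTEM SET work_mem = '64MB'")             # per-sort/hash memory
    cur.execute("SELECT pg_reload_conf()")  # picks up reloadable settings
conn.close()
```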
Thx for the quick response. I am currently testing; PGvector seems to be fine here, I can retrieve vectors within seconds. It seems that the file handling or the...
I've now tested direct queries to my PGvector DB; I get a response within 2-3 sec (LIMIT 30 / L2 distance / 1024 dimensions / 70K docs). When querying for...
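Such a direct check can look like the sketch below, mirroring the stated parameters (LIMIT 30, L2 distance, 1024 dimensions). The table and column names (`document_chunk`, `text`, `embedding`) are assumptions for illustration; `<->` is pgvector's L2-distance operator:

```python
import psycopg2

# Direct L2-distance query against a pgvector table.
# Table/column names and the connection string are assumptions.
query_vec = [0.0] * 1024  # replace with a real 1024-dim embedding of the query text
vec_literal = "[" + ",".join(map(str, query_vec)) + "]"  # pgvector text format

conn = psycopg2.connect("dbname=openwebui user=postgres")
with conn.cursor() as cur:
    cur.execute(
        """
        SELECT id, text, embedding <-> %s::vector AS l2_distance
        FROM document_chunk
        ORDER BY l2_distance
        LIMIT 30
        """,
        (vec_literal,),
    )
    for row_id, _text, dist in cur.fetchall():
        print(row_id, round(dist, 4))
conn.close()
```

With 70K docs, an approximate index such as `CREATE INDEX ON document_chunk USING hnsw (embedding vector_l2_ops);` is what usually keeps this kind of query in the low seconds.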
I did some checks on the backend DB (pg:17). I checked the requests from OWUI on opening the Workspace -> KCs: I saw the following select (could not display...
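One way to surface such selects without catching them live is the pg_stat_statements extension. A minimal sketch, assuming the extension is enabled on the pg:17 instance (`shared_preload_libraries = 'pg_stat_statements'` plus `CREATE EXTENSION pg_stat_statements;`):

```python
import psycopg2

# List the slowest statements recorded by pg_stat_statements, e.g. to spot
# the expensive SELECT that OWUI issues when the Workspace/KCs view opens.
conn = psycopg2.connect("dbname=openwebui user=postgres")
with conn.cursor() as cur:
    cur.execute(
        """
        SELECT mean_exec_time, calls, left(query, 120)
        FROM pg_stat_statements
        ORDER BY mean_exec_time DESC
        LIMIT 10
        """
    )
    for mean_ms, calls, query in cur.fetchall():
        print(f"{mean_ms:8.1f} ms  x{calls:<6} {query}")
conn.close()
```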
Wow, thx Ricardo for your detailed reply and your recommendations on that topic. I am not a programmer; I probably need another day or two to fully understand all your...
Hey guys, thx for the update! I've just tested 0.6.33 on my test env. and **I can confirm that it loads collections much faster in Workspace and chats too.** I'll confirm...