genai-stack runs very slow when RAG is activated

Open wishatch opened this issue 1 year ago • 5 comments

genai-stack works well at a reasonable speed without RAG. But when RAG is activated, it runs very slowly. Any advice on how to solve this? Thx

Nov 15 '23 04:11 wishatch

What LLM are you using? Is it faster if you switch to a smaller model, or to an OpenAI one? Some slowdown is expected with RAG, because the retrieved context is added to the prompt and the LLM has to process more tokens.
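To make the "more tokens" point concrete, here is a minimal sketch of how a prompt grows once retrieved context is prepended. The prompt layout, the chunk sizes, and the helper names `build_prompt` and `rough_token_count` are assumptions for illustration only, not how genai-stack actually assembles its prompts, and a whitespace word count is only a crude stand-in for a real tokenizer.

```python
def build_prompt(question, context_chunks=()):
    """Assemble a prompt; with RAG, retrieved chunks are prepended as context."""
    parts = []
    if context_chunks:
        parts.append("Context:\n" + "\n\n".join(context_chunks))
    parts.append("Question: " + question)
    return "\n\n".join(parts)


def rough_token_count(text):
    # Whitespace word count: a crude proxy, real tokenizers will differ.
    return len(text.split())


question = "Why is my stack slow?"
# Hypothetical retrieved chunks of 200 words each (made-up sizes).
chunks = [" ".join(["word"] * 200) for _ in range(4)]

plain = rough_token_count(build_prompt(question))
with_rag = rough_token_count(build_prompt(question, chunks))
print(f"without RAG: ~{plain} tokens, with RAG: ~{with_rag} tokens")
```

On CPU, prompt processing time grows with prompt length, so a prompt that is many times longer can plausibly account for a large part of the slowdown.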

Nov 15 '23 07:11 oskarhane

I am using:
- llama2 7b
- Ubuntu 22.04 LTS
- Docker Desktop for Windows (WSL2 enabled) v4.25.0
- a very high-end PC (server-class power)
- everything else is the default configuration from the GitHub repo (no graphics card configuration)

Nov 15 '23 22:11 wishatch

I am experiencing the same issue, and wonder if there is any guide available to improve/benchmark the performance.
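In the absence of an official guide, one rough way to benchmark is to time the same question with and without RAG and compare. This is a sketch only: `ask_without_rag` and `ask_with_rag` are hypothetical stand-ins that you would replace with whatever call sends a question to your own stack, and the `time.sleep` merely simulates extra latency.

```python
import statistics
import time


def time_call(fn, *args, repeats=3):
    """Return wall-clock durations in seconds for `repeats` calls of fn(*args)."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Reduce a list of durations to mean and worst-case latency."""
    return {"mean_s": statistics.mean(durations), "max_s": max(durations)}


# Hypothetical stand-ins: replace with real calls to your own deployment.
def ask_without_rag(question):
    return "answer"


def ask_with_rag(question):
    time.sleep(0.02)  # simulated extra latency from retrieval and a longer prompt
    return "answer"


if __name__ == "__main__":
    q = "What is RAG?"
    print("without RAG:", summarize(time_call(ask_without_rag, q)))
    print("with RAG:   ", summarize(time_call(ask_with_rag, q)))
```

Running a few repeats and looking at both mean and max helps separate a one-off warm-up cost (such as model loading) from a consistent slowdown.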

Dec 03 '23 20:12 JasonPad19

Make sure you're running on GPU.

Dec 04 '23 10:12 oskarhane

> Make sure you're running on GPU.

Is there a way to minimize the configuration of genai-stack so that it runs at a reasonable speed without a GPU (it doesn't need to be super fast)? GPUs are expensive. It would be good if I could get familiar with this stack first before purchasing a GPU card. Thx much for the advice.

Dec 04 '23 21:12 wishatch
