quivr
Significantly reduce memory footprint down to 2GB or less?
DESCRIPTION: Can we significantly reduce the memory footprint of Quivr in general? Right now the two Docker containers, "web" and "backend", consume about 8GB of RAM. On my MacBook Pro M1 with 16GB of RAM and my usual other programs running, that leaves me with only 2GB free.
RATIONALE
- I also want to run an LLM locally, for example llama.cpp with Vicuna-7B. This will consume another 9-10GB. If I run a local LLM alongside Quivr, the LLM crawls like a snail.
- In the future, we would like Quivr to run on smartphones as well, so that our second brain is always with us. Reducing the memory footprint is therefore even more critical.
- Most of the time, Quivr is not doing anything. It is only invoked when I run a query or upload something. It is therefore inefficient for it to permanently occupy 8GB of RAM when it is idle most of the time.
IMPLEMENTATION: Sorry, I'm not a developer, but I guess we would need to move away from the Docker architecture?
Thanks for the issue.
What would be possible for now is for you to run the commands found in the Docker files yourself, while we look into optimization in the future.