
qdrant RAM usage (?allow quantisation and memmap)

Open Smensink opened this issue 1 year ago • 5 comments

Hello, I am attempting to create a qdrant database with a large collection of medical guidelines. Unfortunately, although the raw text is only 800 MB, the database rapidly grows to 15+ GB before the privateGPT instance crashes, and RAM usage climbs to 40-50 GB+. I am aware that qdrant supports quantisation and keeping the database on disk instead of in RAM, but I cannot seem to get either working with privateGPT (the way the vectorstore is loaded is not the same as in the qdrant documentation). Is it possible to allow these features?

Smensink avatar Apr 28 '24 05:04 Smensink

If the collection has already been created, you can update it to use quantization and on-disk vectors. This is documented at https://qdrant.tech/documentation/concepts/collections/#update-collection-parameters.
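
For reference, something along these lines with the Python `qdrant-client` should work. This is an untested sketch; the URL and collection name are placeholders, so use whatever your privateGPT setup actually created:

```python
from qdrant_client import QdrantClient, models

client = QdrantClient(url="http://localhost:6333")  # adjust to your Qdrant instance

client.update_collection(
    collection_name="my_collection",  # placeholder: use the collection privateGPT created
    vectors_config={
        # "" targets the default (unnamed) vector; store the original vectors on disk
        "": models.VectorParamsDiff(on_disk=True),
    },
    quantization_config=models.ScalarQuantization(
        scalar=models.ScalarQuantizationConfig(
            type=models.ScalarType.INT8,  # int8 scalar quantization, ~4x smaller in RAM
            always_ram=True,              # keep only the quantized copies in RAM
        ),
    ),
)
```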

There's also a tutorial about loading a large number of vectors: https://qdrant.tech/documentation/tutorials/bulk-upload/#bulk-upload-a-large-number-of-vectors
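
If I recall correctly, the main trick in that tutorial is to disable indexing while the bulk upload runs and restore it afterwards. Roughly, with the same caveats (placeholder collection name, run the first call before ingesting and the second one after):

```python
from qdrant_client import QdrantClient, models

client = QdrantClient(url="http://localhost:6333")

# Before bulk ingestion: stop Qdrant from building the HNSW index on the fly,
# which keeps RAM and CPU usage down while points are being uploaded.
client.update_collection(
    collection_name="my_collection",
    optimizers_config=models.OptimizersConfigDiff(indexing_threshold=0),
)

# ... run the privateGPT ingestion here ...

# After ingestion: restore the default threshold so the index gets built.
client.update_collection(
    collection_name="my_collection",
    optimizers_config=models.OptimizersConfigDiff(indexing_threshold=20000),
)
```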

Anush008 avatar Apr 28 '24 09:04 Anush008

Huge, that's very helpful, thanks. I'll try it tomorrow. Is there any plan to include these options natively in the repo?

Smensink avatar Apr 28 '24 09:04 Smensink

> Is there any plan to include these options natively in the repo?

I am unsure. But it would make sense to call them externally since they're not very common operations.

Anush008 avatar Apr 28 '24 11:04 Anush008

It works! Still more RAM usage than I would've thought, but it seems to level out around 18 GB.

Smensink avatar Apr 29 '24 11:04 Smensink

I'd also recommend referring to https://qdrant.tech/documentation/guides/optimize/.
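
The memmap/on-disk knobs from that guide can also be applied to an existing collection. Roughly, with the same caveats as above (placeholder names, and the numbers are only illustrative):

```python
from qdrant_client import QdrantClient, models

client = QdrantClient(url="http://localhost:6333")

client.update_collection(
    collection_name="my_collection",
    optimizers_config=models.OptimizersConfigDiff(
        memmap_threshold=20000,  # segments above ~20000 kB of vectors become memory-mapped files
    ),
    hnsw_config=models.HnswConfigDiff(on_disk=True),  # keep the HNSW graph on disk as well
)
```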

Anush008 avatar Apr 29 '24 11:04 Anush008