Clarify storage across indexes

Open Toflar opened this issue 3 years ago • 1 comments

It would be great if the docs said something about whether identical documents, or parts of documents, are shared across indexes. E.g. if I add the exact same document A to index A as well as index B, will it use twice the amount of RAM, or will it be shared? Does sharing depend on the documents being 100% identical, or are individual fields shared?

That would be great, thanks :)

By the way, happy to provide a PR if someone is able to reply to the questions but I think when answering here, one might as well create a PR right away ;)

Toflar avatar Mar 24 '21 11:03 Toflar

Hey @Toflar, thanks for bringing this up and sorry for the delay in getting back to you. My understanding is that documents are not shared between indexes, so identical documents in different indexes are treated as completely separate documents. This could change if/when we implement multi-index search.

I will add this information in /learn/core_concepts/documents.md#limitations-and-requirements, and possibly in the storage page you linked as well. Thanks again! 🙏

dichotommy avatar Apr 07 '21 08:04 dichotommy

I believe @dichotommy's answer still holds true today: documents are not shared between indexes and will be treated as different documents. v1.1's new /multi-search endpoint did not alter this behaviour.

However, I do not think we should be making any statements regarding RAM usage. As far as I understand it, the size of a document in an index depends on several variables including index settings and other documents present in the database.

@gmourier, can you confirm all of this?

guimachiavelli avatar Apr 27 '23 15:04 guimachiavelli

> I believe @dichotommy's answer still holds true today: documents are not shared between indexes and will be treated as different documents. v1.1's new /multi-search endpoint did not alter this behaviour.

Exactly @guimachiavelli

If a document is present in 2 indexes with the exact same structure, nothing is shared on disk.

Each index has its own LMDB environment on disk.
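To make the point above concrete, here is a minimal sketch of adding the same document to two indexes through the HTTP API (`POST /indexes/{index_uid}/documents`). The host, the API key, the index names `index_a` and `index_b`, and the helper `build_add_documents_request` are all assumptions made up for illustration. The requests are only built, not sent, so the snippet runs without a server.

```python
import json
import urllib.request

# Assumed values for illustration only; adjust for your own instance.
MEILI_URL = "http://localhost:7700"
API_KEY = "MASTER_KEY"  # hypothetical placeholder key


def build_add_documents_request(index_uid, documents):
    """Build (but do not send) a request that adds documents to one index."""
    return urllib.request.Request(
        url=f"{MEILI_URL}/indexes/{index_uid}/documents",
        data=json.dumps(documents).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_KEY}",
        },
        method="POST",
    )


document = {"id": 1, "title": "Document A"}

# One request per index: each index receives, and stores, its own copy.
pending = [
    build_add_documents_request(uid, [document])
    for uid in ("index_a", "index_b")
]

for req in pending:
    print(req.get_method(), req.full_url)
    # To actually send it: urllib.request.urlopen(req)
```

Nothing in either request refers to the other index, which matches the explanation above: the two copies are indexed and stored independently.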

> As far as I understand it, the size of a document in an index depends on several variables including index settings and other present documents in the database.

Also true. For example, documents in an index that has filterableAttributes set will take up more disk space, because the index stores the additional structures needed to apply that setting at search time.
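As a rough sketch of how two indexes holding the same document can still differ in size, the snippet below builds a settings update that enables filtering on only one of them. It assumes the v1.x route `PUT /indexes/{index_uid}/settings/filterable-attributes`. The host, the API key, the index name `index_a`, the attribute `genre`, and the helper name are hypothetical. The request is built but not sent.

```python
import json
import urllib.request

# Assumed values for illustration only; adjust for your own instance.
MEILI_URL = "http://localhost:7700"
API_KEY = "MASTER_KEY"  # hypothetical placeholder key


def build_filterable_attributes_request(index_uid, attributes):
    """Build (but do not send) a request that sets filterableAttributes."""
    return urllib.request.Request(
        url=f"{MEILI_URL}/indexes/{index_uid}/settings/filterable-attributes",
        data=json.dumps(attributes).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_KEY}",
        },
        method="PUT",
    )


# Only index_a gets the setting; a second index with the same documents
# would not build the extra filtering structures.
settings_req = build_filterable_attributes_request("index_a", ["genre"])
print(settings_req.get_method(), settings_req.full_url)
# To actually send it: urllib.request.urlopen(settings_req)
```

Because settings are scoped to a single index, the same document can end up occupying a different amount of disk space in each index it belongs to.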

cc @macraig

gmourier avatar Apr 27 '23 15:04 gmourier

Brilliant, thanks for the confirmation, @gmourier.

@maryamsulemani97, I'm assigning this to you. Check the comments for the information you need 👍

guimachiavelli avatar May 04 '23 10:05 guimachiavelli