elasticsearch Re-arrange document id bytes to take advantage of shared prefixes

Open salvatore-campagna opened this issue 1 year ago • 12 comments

Here we change the structure of auto-generated document ids to favour compression of the Lucene terms dictionary used to store them. The idea is to generate each document id with its never-changing or slowly-changing bits first so that, when document ids are stored, long runs of values share the same prefix.
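To illustrate why byte order matters, here is a minimal sketch that is not taken from this change. It assumes a hypothetical id made of a 6-byte millisecond timestamp and a 3-byte sequence number (the real id layout may differ), builds the same ids in two byte orders, sorts them as a terms dictionary would, and measures how many leading bytes neighbouring ids share.

```python
def make_id(timestamp_ms: int, sequence: int, slow_first: bool) -> bytes:
    """Build a hypothetical 9-byte id (layout assumed for illustration only)."""
    ts = timestamp_ms.to_bytes(6, "big")
    seq = sequence.to_bytes(3, "big")
    if slow_first:
        # Slowly changing bytes (high-order timestamp bytes) lead the id.
        return ts + seq
    # Fast-changing bytes (low-order sequence bytes) lead the id.
    return seq[::-1] + ts[::-1]


def shared_prefix_len(a: bytes, b: bytes) -> int:
    """Number of leading bytes that a and b have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def mean_shared_prefix(ids: list[bytes]) -> float:
    """Average shared prefix length between neighbours in sorted order."""
    ordered = sorted(ids)
    total = sum(shared_prefix_len(a, b) for a, b in zip(ordered, ordered[1:]))
    return total / (len(ordered) - 1)


def sample_ids(slow_first: bool, count: int = 1000) -> list[bytes]:
    """Ids for `count` documents, ten documents per millisecond (assumed rate)."""
    base_ms = 1_706_090_460_000  # arbitrary starting timestamp
    return [make_id(base_ms + i // 10, i, slow_first) for i in range(count)]


if __name__ == "__main__":
    print("slow-changing bytes first:", mean_shared_prefix(sample_ids(True)))
    print("fast-changing bytes first:", mean_shared_prefix(sample_ids(False)))
```

With the slow-changing bytes first, neighbouring ids in sorted order share most of their leading bytes, which is the property a prefix-compressed terms dictionary can exploit. With the fast-changing bytes first, neighbours share at most one byte in this sample.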

Jan 24 '24 10:01 salvatore-campagna
