Adrien Grand
Tantivy actually looks pretty good. Rust allows it to be closer to hardware than Lucene and Tantivy seems to have some specialization for certain common cases like pure disjunctions of...
Maybe the `Index` ctor (in competition.py) could take a new argument to introduce arbitrary sparsity so that we could compare the sorting/faceting tasks between a sparse and a dense index,...
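To illustrate what "arbitrary sparsity" could mean in practice, here is a minimal Python sketch. It is not taken from competition.py: the helper name `sparsify`, its parameters, and the dict-based document shape are all assumptions made for illustration. It drops a given field from roughly a `sparsity` fraction of documents, using a seeded RNG so that sparse and dense runs stay reproducible.

```python
import random


def sparsify(docs, field, sparsity, seed=0):
    """Return copies of `docs` with `field` removed from ~`sparsity` of them.

    Hypothetical helper: `docs` is assumed to be an iterable of dicts.
    sparsity=0.0 yields a dense index input, sparsity=1.0 removes the
    field from every document.
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be in [0, 1]")
    rng = random.Random(seed)  # seeded so repeated runs drop the same docs
    out = []
    for doc in docs:
        doc = dict(doc)  # copy so the caller's documents are left untouched
        if rng.random() < sparsity:
            doc.pop(field, None)
        out.append(doc)
    return out
```

The same task file could then be run against the output of `sparsify(docs, "dayOfYear", 0.0)` and, say, `sparsify(docs, "dayOfYear", 0.9)`; the field name here is only an example.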
We already have e.g. sorting tasks that read from doc values. See for instance `HighTermDayOfYearSort` in `tasks/wikimedium.10M.nostopwords.tasks`. Maybe we could tweak the indexing logic to add ways to add arbitrary...
> For random access impact measurement it is poor as it extracts the values sequentially with skip-ahead

Doc values have been designed to handle sorting and faceting, so these are...
Sorry I was sure that I replied, but I must have forgotten to hit "Comment" before closing the tab.

> Do you consider Solr's export to be abuse?

It requires...
FWIW it looks like this feature is using advance() while it should use advanceExact().
> Could be it is just an oversight?

`advanceExact` was added after we switched to doc-value iterators; I believe we just never changed this call site to use this new...
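For readers unfamiliar with the distinction, the sketch below models the two calls in plain Python. It is a toy stand-in, not Lucene code: the class `ToySparseDocValues` and both lookup helpers are hypothetical. It only mirrors the documented semantics, where `advance(target)` moves to the first document at or after `target` that has a value, and `advanceExact(target)` positions on `target` and reports whether that document has a value.

```python
import bisect

# Sentinel for "iterator exhausted"; Lucene uses Integer.MAX_VALUE for this.
NO_MORE_DOCS = 2**31 - 1


class ToySparseDocValues:
    """Toy model of a sparse doc-values iterator (illustration only)."""

    def __init__(self, values_by_doc):
        self._values = dict(values_by_doc)
        self._docs = sorted(self._values)
        self._doc = -1

    def doc_id(self):
        return self._doc

    def advance(self, target):
        """Move to the first doc >= target that has a value."""
        i = bisect.bisect_left(self._docs, target)
        self._doc = self._docs[i] if i < len(self._docs) else NO_MORE_DOCS
        return self._doc

    def advance_exact(self, target):
        """Position on target and return whether it has a value."""
        self._doc = target
        return target in self._values

    def value(self):
        return self._values[self._doc]


def lookup_with_advance(dv, doc):
    # The caller has to track the position and compare doc IDs itself.
    if dv.doc_id() < doc:
        dv.advance(doc)
    return dv.value() if dv.doc_id() == doc else None


def lookup_with_advance_exact(dv, doc):
    # One call answers "does this doc have a value?" directly.
    return dv.value() if dv.advance_exact(doc) else None
```

Both helpers return the same results when documents are visited in increasing order, but the `advance_exact` variant states the intent directly and leaves no room for forgetting the doc ID comparison.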
> Is it interesting to measure the performance for faceted and grouped searches with a focus on relatively small result sets?

Yes. For instance luceneutil has LowTerm, MedTerm and HighTerm...
I think sorted queries are interesting for that reason: it is a realistic use-case, yet secondary processing is lightweight as most of the time the new document will not be...
I added the release highlight label based on the speedup we observed in nightly benchmarks with this change: https://elasticsearch-benchmarks.elastic.co/index.html#tracks/sql/nightly/default/30d