PSeitz

Results 357 comments of PSeitz

There's nothing off the shelf, but in general it should be feasible. If you want to implement it, I would start from [phrase_query](https://github.com/quickwit-oss/tantivy/blob/main/src/query/phrase_query/phrase_query.rs).

I think that debug_assert is too strict
```rust
fn seek(&mut self, target: DocId) -> DocId {
    debug_assert!(self.doc() <= target);
    if self.doc() >= target {
        return self.doc();
    }
    ...
}
```
Do you have something...
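
To make the point about the assertion concrete, here is a toy sketch of a `seek` that treats a target at or behind the current document as a no-op instead of asserting. `VecDocSet`, its fields, and the `TERMINATED` value are hypothetical stand-ins written for this illustration; this is not tantivy's implementation.

```rust
type DocId = u32;

// Sentinel returned once the doc set is exhausted (value chosen for this sketch).
const TERMINATED: DocId = i32::MAX as u32;

/// Toy doc set over a sorted list of doc ids (hypothetical, illustration only).
struct VecDocSet {
    docs: Vec<DocId>,
    cursor: usize,
}

impl VecDocSet {
    fn new(docs: Vec<DocId>) -> Self {
        VecDocSet { docs, cursor: 0 }
    }

    /// Current document, or TERMINATED when the cursor is past the end.
    fn doc(&self) -> DocId {
        self.docs.get(self.cursor).copied().unwrap_or(TERMINATED)
    }

    /// Moves to the next document and returns it.
    fn advance(&mut self) -> DocId {
        if self.cursor < self.docs.len() {
            self.cursor += 1;
        }
        self.doc()
    }

    /// Moves to the first document >= target.
    /// No assertion on `self.doc() <= target`: a target behind the
    /// current position simply returns the current document.
    fn seek(&mut self, target: DocId) -> DocId {
        if self.doc() >= target {
            return self.doc();
        }
        while self.doc() < target {
            self.advance();
        }
        self.doc()
    }
}

fn main() {
    let mut docs = VecDocSet::new(vec![2, 5, 9]);
    assert_eq!(docs.seek(5), 5);
    // Target behind the current doc: handled by the early return.
    assert_eq!(docs.seek(3), 5);
    assert_eq!(docs.seek(6), 9);
    assert_eq!(docs.seek(100), TERMINATED);
}
```

With the early return in place, a strict `debug_assert!(self.doc() <= target)` would only reject calls that the function body already handles correctly.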

Related issue: https://github.com/launchbadge/sqlx/issues/1896

I think we can reuse the same code we use for the `_field_caps` API. The check in `is_metadata_count_request` is probably returning false when it shouldn't.

Everything except the term query count special case, which I'm not sure we need

Yes, we don't scan all splits in some cases:
```rust
// if client wants full count, or we are doing an aggregation, we want to run every splits.
// However...
```

Fast field text fields are currently always zstd compressed. Accessing them is pretty expensive, since we always decompress zstd blocks (This will be configurable in the next release with the `columnar-zstd-compression`...

Can you add timings for Approach 1? A flamegraph for each would be helpful to spot anything unusual.

Hi, you should be able to control the segment size by setting a custom merge policy: https://docs.rs/tantivy/latest/tantivy/indexer/struct.IndexWriter.html#method.set_merge_policy
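
As a rough illustration of how a merge policy can bound segment size, the sketch below models a policy that only picks segments under a document-count cap, so a segment stops being merged once it grows past that cap. It uses only the standard library; `Segment`, `SizeCappedPolicy`, and both of its fields are hypothetical names for this toy model and are not tantivy's types. In tantivy itself the policy would be installed through the `set_merge_policy` method linked above.

```rust
/// Toy segment description (hypothetical; not tantivy's segment metadata).
#[derive(Debug, Clone, Copy, PartialEq)]
struct Segment {
    id: u32,
    num_docs: u32,
}

/// Toy merge policy: only segments with fewer than `max_docs_before_merge`
/// documents are eligible, and a merge is proposed only when at least
/// `min_num_segments` eligible segments exist.
struct SizeCappedPolicy {
    max_docs_before_merge: u32,
    min_num_segments: usize,
}

impl SizeCappedPolicy {
    /// Returns the ids of the segments to merge together, or an empty
    /// list when no merge should happen.
    fn merge_candidates(&self, segments: &[Segment]) -> Vec<u32> {
        let small: Vec<u32> = segments
            .iter()
            .filter(|s| s.num_docs < self.max_docs_before_merge)
            .map(|s| s.id)
            .collect();
        if small.len() >= self.min_num_segments {
            small
        } else {
            Vec::new()
        }
    }
}

fn main() {
    let policy = SizeCappedPolicy {
        max_docs_before_merge: 1_000,
        min_num_segments: 2,
    };
    let segments = [
        Segment { id: 0, num_docs: 50_000 }, // above the cap, left alone
        Segment { id: 1, num_docs: 300 },
        Segment { id: 2, num_docs: 120 },
    ];
    let picked = policy.merge_candidates(&segments);
    assert_eq!(picked, vec![1, 2]);
    println!("segments to merge: {:?}", picked);
}
```

Note that the cap applies to the inputs of a merge, so a merged segment can still end up somewhat larger than the cap; it just will not be selected again afterwards.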

We don't download the whole segment, it's done by a mix of caching and using sstable as the dictionary.