PSeitz issues

Results: 117 issues of PSeitz

Document as trait

Currently the layout and its behaviour (serialization etc.) for Document are provided by tantivy. Users of tantivy have to convert their structure into the tantivy Document. An alternative approach would...

Single-value fast field for string fields

For compatibility reasons, we always create a multi-value fast field index when creating a fast field on a string field. When the field is effectively single-valued, we may...

Support Date type in Aggregation

Currently the Date type is filtered out as a valid fast field type, but this may be overly strict.

Expull: replace read_to_end with iterator over bytes

The Exponential Unrolled List read_to_end in expull may consume a lot of memory. Since it is used by the postinglist record, it contains all docids (plus optional positions and term frequencies) for one term,...
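The trade-off named in the issue title can be illustrated with a minimal sketch. `BlockList`, its `blocks` field, and `iter_bytes` are hypothetical names invented for this example; they are not tantivy's actual expull types or API. The sketch only contrasts copying every block into one buffer with yielding the bytes lazily.

```rust
/// Hypothetical stand-in for an unrolled list: bytes live in separate blocks.
/// This is NOT tantivy's expull implementation, only an illustration.
struct BlockList {
    blocks: Vec<Vec<u8>>,
}

impl BlockList {
    /// Copies every block into one contiguous buffer.
    /// Peak memory is roughly twice the size of the stored data.
    fn read_to_end(&self, output: &mut Vec<u8>) {
        for block in &self.blocks {
            output.extend_from_slice(block);
        }
    }

    /// Yields the bytes lazily, block by block, without a second buffer.
    fn iter_bytes(&self) -> impl Iterator<Item = u8> + '_ {
        self.blocks.iter().flat_map(|block| block.iter().copied())
    }
}

fn main() {
    let list = BlockList {
        blocks: vec![vec![1, 2], vec![3, 4, 5, 6], vec![7]],
    };

    let mut copied = Vec::new();
    list.read_to_end(&mut copied);

    // Both approaches see the same bytes in the same order.
    assert!(list.iter_bytes().eq(copied.iter().copied()));
    println!("{} bytes", list.iter_bytes().count());
}
```

A consumer that decodes docids one at a time could read from such an iterator directly, so memory use would no longer grow with the length of the posting list.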

Fast field datasets

# Datasets For the [fast field codecs](https://github.com/quickwit-oss/tantivy/tree/main/fastfield_codecs) we need to have good datasets to test them. Ideally these are datasets which we would expect to be indexed in a search...

Histogram performance

The histogram performance comparison between the generic solution (https://github.com/quickwit-oss/tantivy/tree/main/src/aggregation) and the specialized histogram (https://github.com/quickwit-oss/tantivy/blob/main/src/collector/histogram_collector.rs) suggests there is some headroom for improvement. The bench collects 1_000_000 docs into 10_000 buckets....
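To give a sense of the workload size mentioned in the bench, here is a minimal sketch of fixed-width bucketing. The `histogram` function, its parameters, and the synthetic input are assumptions made for illustration. It reproduces neither the aggregation code nor histogram_collector.rs, and it says nothing about their relative speed.

```rust
/// Counts values into fixed-width buckets.
/// Bucket i covers [min + i * width, min + (i + 1) * width).
/// `width` must be non-zero; values outside the range are ignored.
fn histogram(values: &[u64], min: u64, width: u64, num_buckets: usize) -> Vec<u64> {
    let mut counts = vec![0u64; num_buckets];
    for &value in values {
        if value < min {
            continue;
        }
        let idx = ((value - min) / width) as usize;
        if idx < num_buckets {
            counts[idx] += 1;
        }
    }
    counts
}

fn main() {
    // Synthetic data: 1_000_000 values spread evenly over 10_000 buckets of width 100.
    let values: Vec<u64> = (0..1_000_000u64).collect();
    let counts = histogram(&values, 0, 100, 10_000);

    assert_eq!(counts.len(), 10_000);
    // Each bucket receives exactly 100 of the evenly spaced values.
    assert!(counts.iter().all(|&c| c == 100));
}
```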

Link to outdated gitbooks page in docs

https://docs.rs/tantivy/latest/tantivy/postings/struct.InvertedIndexSerializer.html links to https://fulmicoton.gitbooks.io/tantivy-doc/content/inverted-index.html, which contains outdated information, e.g. simdcomp. The site looks nice, but overall there are too many different resources; we should probably deprecate some and concentrate...
