PSeitz comments

Results: 106 comments by PSeitz

Inconsistent API implementation of BytesOptions, TextOptions, and NumericOptions

> I will fix the inconsistencies, once someone can confirm here

Thanks for the links. Yes, they are inconsistent and should be fixed.

Storing arbitrary attributes for segments

One thing that would be interesting for the quickwit use case is to store values in fields per segment if you have a limited number of values. This information can...

Possible to save 16 bytes on linear block metadata in the multilinear interpolation

> data_start_offset can be deduced by summing num_bits * block_size for each block before the block where we want to read. we can do it when we open the fast...
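The quoted idea can be sketched as a running sum computed once when the fast field is opened. This is only an illustration: `BlockMeta`, `data_start_offsets`, and the choice to express offsets in bits are assumptions made for the sketch, not tantivy's actual layout, which may pad or align blocks differently.

```rust
/// Hypothetical per-block metadata: only the bit width is stored,
/// the start offset is no longer persisted.
#[derive(Debug, Clone, Copy)]
struct BlockMeta {
    num_bits: u8,
}

/// Returns the start offset (in bits) of every block, computed as a running
/// sum of `num_bits * block_size` over all preceding blocks.
/// Assumes every block holds exactly `block_size` values.
fn data_start_offsets(blocks: &[BlockMeta], block_size: u64) -> Vec<u64> {
    let mut offsets = Vec::with_capacity(blocks.len());
    let mut next_start = 0u64;
    for block in blocks {
        offsets.push(next_start);
        next_start += u64::from(block.num_bits) * block_size;
    }
    offsets
}
```

The trade-off in this sketch is one linear pass over the block metadata at open time in exchange for not storing an offset per block.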

Add a GCD aware fastfield compression format.

Closed via https://github.com/quickwit-oss/tantivy/pull/1418

Consider Capability Based API

Yes, every change in the format should be expressed as a capability. Capabilities can also cover other parts, like search features. I think it will make consumption easier, but not...

Consider Capability Based API

I think if we add this early we can ease out some quirks until it gets serious, and it may help with experimenting without breaking everything

Choosing fastfield encoders

With https://github.com/tantivy-search/tantivy/pull/1072 every codec returns a compression estimate. A reality check and options for setting trade-offs are still to do.
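As a rough illustration of choosing an encoder from per-codec estimates, the sketch below picks the codec with the lowest estimated ratio. The `Codec` enum, its two toy variants, and `pick_codec` are hypothetical and do not mirror the API introduced in the pull request. The estimates also ignore metadata overhead, which is one reason a reality check would still be needed.

```rust
/// Hypothetical toy codecs, used only to illustrate estimate-based selection.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Codec {
    /// Bit-packs each value using the width of the largest value.
    RawBits,
    /// Subtracts the minimum first, then bit-packs the remainder.
    MinShifted,
}

/// Number of bits needed to represent `value` (0 needs 0 bits).
fn bits_needed(value: u64) -> u32 {
    64 - value.leading_zeros()
}

impl Codec {
    /// Estimated compressed size divided by raw size (64 bits per value).
    /// Lower is better. Metadata overhead is deliberately ignored.
    fn estimate(self, values: &[u64]) -> f32 {
        let max = values.iter().copied().max().unwrap_or(0);
        let min = values.iter().copied().min().unwrap_or(0);
        let bits = match self {
            Codec::RawBits => bits_needed(max),
            Codec::MinShifted => bits_needed(max - min),
        };
        bits as f32 / 64.0
    }
}

/// Picks the codec with the smallest estimate, or `None` if no codec is given.
fn pick_codec(codecs: &[Codec], values: &[u64]) -> Option<Codec> {
    codecs
        .iter()
        .copied()
        .min_by(|a, b| a.estimate(values).total_cmp(&b.estimate(values)))
}
```

A trade-off option could be layered on top, for example by weighting each estimate with a per-codec decoding cost, but that is beyond this sketch.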

Fast field on string fields

# API

STRING takes the untokenized content of a field value ("raw" tokenizer), e.g. "Cool Nice" -> "Cool Nice"

TEXT is tokenizing the content of a field value with...

Fast field on string fields

> > Considering our docvalue codec is already dynamic couldn't we remove the extra cost? If a fastfield is multivalued according to the schema but only contains single valued, it...

Handling null values in fastfield for better use in aggregation

I think we should change the fast_field API from `fn get(&self, doc: DocId) -> Item;` to `fn get(&self, doc: DocId) -> Option<Item>;`
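A minimal sketch of how an `Option`-returning `get` could let an aggregation skip documents without a value. The trait shape, `OptionalColumn`, and `average` are illustrative assumptions, not tantivy's actual types. Treating an out-of-range document as `None` is also a choice made only for this sketch.

```rust
type DocId = u32;

/// Hypothetical reader trait using the proposed signature.
trait FastFieldReader<Item> {
    fn get(&self, doc: DocId) -> Option<Item>;
}

/// Toy in-memory column where a document may have no value.
struct OptionalColumn {
    values: Vec<Option<u64>>,
}

impl FastFieldReader<u64> for OptionalColumn {
    fn get(&self, doc: DocId) -> Option<u64> {
        // Out-of-range documents are treated like nulls in this sketch.
        self.values.get(doc as usize).copied().flatten()
    }
}

/// Averages only over documents that have a value.
/// Returns `None` when no document contributes.
fn average(
    reader: &impl FastFieldReader<u64>,
    docs: impl IntoIterator<Item = DocId>,
) -> Option<f64> {
    let (sum, count) = docs
        .into_iter()
        .filter_map(|doc| reader.get(doc))
        .fold((0u64, 0u64), |(sum, count), value| (sum + value, count + 1));
    if count == 0 {
        None
    } else {
        Some(sum as f64 / count as f64)
    }
}
```

With the current `-> Item` signature, a missing value would have to be represented by a sentinel such as 0, which would skew an average like this one.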