PSeitz

Results 106 comments of PSeitz

> I will fix the inconsistencies, once someone can confirm here

Thanks for the links. Yes, they are inconsistent and should be fixed.

One thing that would be interesting for the quickwit use case is to store values in fields per segment if you have a limited number of values. This information can...

> data_start_offset can be deduced by summing num_bits * block_size for each block before the block where we want to read. we can do it when we open the fast...
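The offset computation described in the quote can be sketched as a running sum. This is only an illustration, not tantivy's code: `BlockMeta` and `data_start_offsets` are hypothetical names, every block is assumed to hold exactly `block_size` values, and offsets are expressed in bits with no byte alignment.

```rust
/// Hypothetical per-block metadata; only `num_bits` comes from the quoted comment.
#[derive(Debug, Clone, Copy)]
pub struct BlockMeta {
    /// Number of bits used to encode each value in this block.
    pub num_bits: u8,
}

/// Computes the start offset (in bits) of every block's data.
///
/// Assumption: all blocks contain exactly `block_size` values, so the offset
/// of a block is the sum of `num_bits * block_size` over all preceding blocks.
pub fn data_start_offsets(blocks: &[BlockMeta], block_size: u64) -> Vec<u64> {
    let mut offsets = Vec::with_capacity(blocks.len());
    let mut running = 0u64;
    for block in blocks {
        // The current block starts where all previous blocks end.
        offsets.push(running);
        running += u64::from(block.num_bits) * block_size;
    }
    offsets
}
```

Computing the whole vector once when the file is opened trades a small amount of memory for constant-time lookups on every read.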

Closed via https://github.com/quickwit-oss/tantivy/pull/1418

Yes, every change in the format should be expressed as a capability. Capabilities can also cover other parts, like search features. I think it will make consumption easier, but not...

I think if we add this early, we can iron out some quirks before it gets serious, and it may help with experimenting without breaking everything.

With https://github.com/tantivy-search/tantivy/pull/1072, every codec returns a compression estimate. A reality check and options for configuring trade-offs are still to do.
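To illustrate how per-codec estimates could be used, here is a minimal sketch that picks the codec with the lowest estimated ratio. The trait `CodecEstimator`, the two codecs, and `pick_codec` are hypothetical names chosen for this example and are not the API introduced by the pull request.

```rust
/// Hypothetical trait: each codec reports an estimated compression ratio.
pub trait CodecEstimator {
    fn name(&self) -> &'static str;
    /// Estimated compressed size divided by uncompressed size (lower is better).
    fn estimate(&self, values: &[u64]) -> f32;
}

/// Stores every value as a full 64-bit integer.
pub struct Raw;

/// Stores `value - min` using just enough bits to cover the value range.
pub struct Bitpacked;

impl CodecEstimator for Raw {
    fn name(&self) -> &'static str {
        "raw"
    }
    fn estimate(&self, _values: &[u64]) -> f32 {
        1.0
    }
}

impl CodecEstimator for Bitpacked {
    fn name(&self) -> &'static str {
        "bitpacked"
    }
    fn estimate(&self, values: &[u64]) -> f32 {
        match (values.iter().min(), values.iter().max()) {
            (Some(min), Some(max)) => {
                // Bits required to represent the largest delta from the minimum.
                let num_bits = 64 - (max - min).leading_zeros();
                num_bits as f32 / 64.0
            }
            // No data: report no gain over raw storage.
            _ => 1.0,
        }
    }
}

/// Returns the name of the codec with the lowest estimate, if any codec is given.
pub fn pick_codec(codecs: &[Box<dyn CodecEstimator>], values: &[u64]) -> Option<&'static str> {
    codecs
        .iter()
        .min_by(|a, b| a.estimate(values).total_cmp(&b.estimate(values)))
        .map(|codec| codec.name())
}
```

An estimate like this ignores per-codec metadata and read cost, which is where a reality check against actual output sizes and explicit trade-off options would come in.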

# API

STRING takes the untokenized content of a field value ("raw" tokenizer), e.g. "Cool Nice" -> "Cool Nice". TEXT tokenizes the content of a field value with...

> > Considering our docvalue codec is already dynamic couldn't we remove the extra cost? If a fastfield is multivalued according to the schema but only contains single valued, it...

I think we should change the fast_field API from `fn get(&self, doc: DocId) -> Item;` to `fn get(&self, doc: DocId) -> Option<Item>;`.
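A minimal sketch of what an optional getter could look like, assuming the intended return type is `Option<Item>`. The trait and the `SparseU64Column` type below are hypothetical stand-ins and not tantivy's actual definitions.

```rust
pub type DocId = u32;

/// Hypothetical reader trait: `get` returns `None` when a document has no value.
pub trait FastFieldReader<Item> {
    fn get(&self, doc: DocId) -> Option<Item>;
}

/// Hypothetical in-memory column where some documents have no value.
pub struct SparseU64Column {
    pub values: Vec<Option<u64>>,
}

impl FastFieldReader<u64> for SparseU64Column {
    fn get(&self, doc: DocId) -> Option<u64> {
        // Out-of-range doc ids and missing values both map to `None`.
        self.values.get(doc as usize).copied().flatten()
    }
}
```

With this shape, callers choose their own fallback, for example `column.get(doc).unwrap_or(0)`, instead of the reader silently returning a default value.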