PSeitz
Includes cardinality aggregation and a term aggregation performance improvement for large "size" parameters. The store size increase in the gh test is coming from the additional optional index in the multivalue...
When trying to quit quickwit, it occasionally hangs at decommissioning ingester. Only `kill -9` works.
```
^C2024-06-03T09:35:15.935Z INFO quickwit_ingest::ingest_v2::ingester: decommissioning ingester
^C
```
I couldn't reproduce it, seems to happen...
Add logic to detect which splits will deliver the top n results for requests. This is only supported for match_all requests, optionally with sort_by on the timestamp. The change extends...
- **update tantivy** - **use serde_json_borrow instead of serde_json::Value** ``` ➜ quickwit-indices cat mezmo/mezmo-use-stage-2023-01-20-ndjson/lines.njson | quickwit tool local-ingest --index mezmo serde_json_borrow + CompactDoc Num docs 46475 Parse errs 0 PublSplits...
`term_aggregation_high_card_top_100` between `7fb3aa54` and `b97f398c`: around 6% more CPU and 20% more latency.
`split_cache.max_file_descriptors` is undocumented, but it is referenced in a validation that may fail:
```
Caused by: max_num_concurrent_split_searches (1000) must be lower or equal to split_cache.max_file_descriptors (100)
```
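The constraint behind that error message can be sketched as a small check. This is a minimal illustration, not quickwit's actual code: the `SearcherLimits` struct and the `validate` function are hypothetical names, and only the two setting names and the message text come from the error above.

```rust
/// Hypothetical stand-in for the two settings named in the error message.
struct SearcherLimits {
    max_num_concurrent_split_searches: usize,
    split_cache_max_file_descriptors: usize,
}

/// Rejects configurations where more splits may be searched concurrently
/// than the split cache is allowed to hold open file descriptors for.
fn validate(limits: &SearcherLimits) -> Result<(), String> {
    if limits.max_num_concurrent_split_searches > limits.split_cache_max_file_descriptors {
        return Err(format!(
            "max_num_concurrent_split_searches ({}) must be lower or equal to split_cache.max_file_descriptors ({})",
            limits.max_num_concurrent_split_searches,
            limits.split_cache_max_file_descriptors
        ));
    }
    Ok(())
}
```

With the values from the report (1000 and 100) the check fails. Raising the file descriptor limit or lowering the concurrency limit makes it pass, which is why the setting needs to be documented.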
add a test to `rest-api-tests` and document
The `calendar_interval` parameter is currently not supported in the date histogram aggregation. This is an outline of its challenges and draft solutions. Unlike `fixed_interval`, `calendar_interval` may have intervals of different...
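To make the core difficulty concrete, here is a minimal sketch of why a one-month calendar bucket cannot be expressed as a fixed width. The function names are hypothetical, and the sketch assumes UTC with no time zones, DST, or leap seconds. It is not tantivy's implementation.

```rust
/// Returns true for leap years in the Gregorian calendar.
fn is_leap_year(year: i32) -> bool {
    (year % 4 == 0 && year % 100 != 0) || year % 400 == 0
}

/// Number of days in a calendar month (`month` is 1..=12).
fn days_in_month(year: i32, month: u32) -> u32 {
    match month {
        1 | 3 | 5 | 7 | 8 | 10 | 12 => 31,
        4 | 6 | 9 | 11 => 30,
        2 => {
            if is_leap_year(year) {
                29
            } else {
                28
            }
        }
        _ => panic!("month must be in 1..=12"),
    }
}

/// Width in seconds of the one-month bucket starting at the given month.
/// Assumes UTC: no DST shifts and no leap seconds.
fn month_bucket_width_secs(year: i32, month: u32) -> u64 {
    u64::from(days_in_month(year, month)) * 86_400
}
```

February 2023, February 2024, and January 2024 all yield different widths, so a bucket key cannot be computed by dividing the timestamp by a constant as a fixed interval allows.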
We coerce numerical values into a common numerical column. That may affect the precision of the cardinality aggregation. E.g. ``` Segment 1 {"val": 10} {"val": 5.5} => f64 Column Segment...
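The excerpt is cut off, so the following is only one assumed way such coercion could matter, not a reproduction of the reported case. It supposes that one segment stores the value 10 in an f64 column and another stores it in an i64 column, and that distinct values are counted from their raw 64-bit patterns. An exact `HashSet` stands in for the approximate sketch a real cardinality aggregation would use, and both function names are hypothetical.

```rust
use std::collections::HashSet;

/// Counts distinct raw 64-bit patterns. An exact set is used here in place
/// of an approximate structure such as HyperLogLog.
fn distinct_bit_patterns(patterns: &[u64]) -> usize {
    patterns.iter().collect::<HashSet<_>>().len()
}

/// The value 10 as stored by an i64 column and by an f64 column
/// (assumed scenario, not taken from the excerpt).
fn ten_in_mixed_columns() -> Vec<u64> {
    vec![10i64 as u64, 10.0f64.to_bits()]
}

/// The same two values after both are coerced to f64 first.
fn ten_coerced_to_f64() -> Vec<u64> {
    vec![(10i64 as f64).to_bits(), 10.0f64.to_bits()]
}
```

Counting the mixed representations yields two distinct values for what is logically one number, while coercing both to f64 first yields one. Under this assumption, the per-segment column type changes the result.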
The way we handle precision truncation of `DateTime` when querying is opaque and error-prone. When using a date in the inverted index we want to truncate it to seconds, since...
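A minimal sketch of the truncation itself, assuming timestamps are held as nanoseconds since the Unix epoch in an `i64`. The function name is hypothetical and this is not tantivy's `DateTime` API. The point is that the same truncation has to be applied at indexing time and at query time, otherwise a term lookup misses.

```rust
const NANOS_PER_SEC: i64 = 1_000_000_000;

/// Truncates a nanosecond timestamp down to whole seconds.
/// `div_euclid` rounds toward negative infinity, so timestamps before 1970
/// land in the second that contains them rather than the following one.
/// Note: inputs within one second of `i64::MIN` would overflow the multiplication.
fn truncate_to_seconds(timestamp_nanos: i64) -> i64 {
    timestamp_nanos.div_euclid(NANOS_PER_SEC) * NANOS_PER_SEC
}
```

Two timestamps within the same second, for example an indexed value at `.935` and a query value at `.100`, map to the same truncated value and therefore to the same term.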