Implement extended stats aggregation.

Open guilload opened this issue 2 years ago • 2 comments

Extended stats aggregation

Look into using the parallel version of Welford's online algorithm for computing the unbiased variance.
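For reference, here is a minimal sketch of what that could look like: Welford's single-value update plus the pairwise merge step (Chan et al.) that makes it usable across partial results. The `Moments` type and its method names are hypothetical and are not part of tantivy's API.

```rust
/// Running summary of a stream of values.
/// Hypothetical type for illustration only, not tantivy's.
#[derive(Clone, Copy, Debug, Default)]
struct Moments {
    count: u64,
    mean: f64,
    /// Sum of squared deviations from the current mean.
    m2: f64,
}

impl Moments {
    /// Welford's online update for a single value.
    fn collect(&mut self, value: f64) {
        self.count += 1;
        let delta = value - self.mean;
        self.mean += delta / self.count as f64;
        // Uses the already-updated mean, as Welford's method requires.
        self.m2 += delta * (value - self.mean);
    }

    /// Parallel combine step: folds another partial summary into this one.
    fn merge(&mut self, other: &Moments) {
        if other.count == 0 {
            return;
        }
        if self.count == 0 {
            *self = *other;
            return;
        }
        let n_a = self.count as f64;
        let n_b = other.count as f64;
        let n = n_a + n_b;
        let delta = other.mean - self.mean;
        self.mean += delta * n_b / n;
        self.m2 += other.m2 + delta * delta * n_a * n_b / n;
        self.count += other.count;
    }

    /// Unbiased (sample) variance; `None` with fewer than two values.
    fn sample_variance(&self) -> Option<f64> {
        if self.count < 2 {
            None
        } else {
            Some(self.m2 / (self.count - 1) as f64)
        }
    }
}

fn main() {
    let data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0];
    // Pretend the two halves were collected independently, then merged.
    let (left, right) = data.split_at(3);
    let mut a = Moments::default();
    left.iter().for_each(|&v| a.collect(v));
    let mut b = Moments::default();
    right.iter().for_each(|&v| b.collect(v));
    a.merge(&b);
    println!("sample variance: {:?}", a.sample_variance());
}
```

The merge step is what matters for an aggregation: each partial result only needs to carry the count, the mean, and the sum of squared deviations, and the merged variance should match a single pass over all values up to floating-point rounding.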

guilload, Jan 13 '23 23:01

Hi there, I'm interested in working on this issue and would appreciate some guidance on getting started. I have some experience with Rust and am familiar with the concept of extended stats aggregation. Please let me know how I can contribute to this project. Thanks!

ExpressGradient, Mar 25 '23 08:03

Hi,

All the aggregations in tantivy live in the aggregation folder: https://github.com/quickwit-oss/tantivy/tree/main/src/aggregation

The aggregation docs are here (note that this is the latest release, not the main branch; for main, it's better to build the docs locally with cargo doc): https://docs.rs/tantivy/latest/tantivy/aggregation/index.html

For the extended stats, it's probably a good idea to use the normal stats as a blueprint. There are also tests in that file. https://github.com/quickwit-oss/tantivy/blob/main/src/aggregation/metric/stats.rs

Connecting a new aggregation is pretty straightforward: just extend the request and result enums ... and then fix all the compile errors. https://github.com/quickwit-oss/tantivy/blob/main/src/aggregation/agg_req.rs#L229 https://github.com/quickwit-oss/tantivy/blob/main/src/aggregation/agg_result.rs#L84
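To illustrate the "extend the enums, then follow the compiler" workflow, here is a small self-contained sketch. The enum names, variants, and fields are made up for the example and do not mirror tantivy's actual request and result types; the point is only that an exhaustive match stops compiling once a variant is added, which shows you every place that needs handling.

```rust
// Hypothetical stand-ins for illustration; tantivy's real enums differ.
#[derive(Debug, PartialEq)]
enum MetricRequest {
    Stats { field: String },
    // Newly added variant.
    ExtendedStats { field: String },
}

#[derive(Debug, PartialEq)]
enum MetricResult {
    Stats { count: u64, sum: f64 },
    // Newly added variant carrying the extra statistic.
    ExtendedStats { count: u64, sum: f64, sum_of_squares: f64 },
}

fn compute(req: &MetricRequest, values: &[f64]) -> MetricResult {
    let count = values.len() as u64;
    let sum: f64 = values.iter().sum();
    // This match has no wildcard arm, so adding `ExtendedStats` to the
    // request enum makes it fail to compile (non-exhaustive patterns)
    // until the new arm below is written.
    match req {
        MetricRequest::Stats { .. } => MetricResult::Stats { count, sum },
        MetricRequest::ExtendedStats { .. } => MetricResult::ExtendedStats {
            count,
            sum,
            sum_of_squares: values.iter().map(|v| v * v).sum(),
        },
    }
}

fn main() {
    let req = MetricRequest::ExtendedStats { field: "price".to_string() };
    println!("{:?}", compute(&req, &[1.0, 2.0, 3.0]));
}
```

Avoiding a catch-all `_` arm in such matches is what makes the compile errors useful as a to-do list.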

Also, feel free to drop by on Discord to ask questions: https://discord.gg/MT27AG5EVE

PSeitz, Mar 25 '23 08:03