Allow value_count to support text fields

Open • katlim-br opened this issue 1 year ago • 1 comment

Is your feature request related to a problem? Please describe.

We want to obtain the number of occurrences of a field in the results of a search query.

For example:

I send the following query to filter the results and also get aggregations for them (how many times each field is present in the results):

{
    "query": "xyz",
    "max_hits": 3,
    "aggs": {
        "hostname": {
            "value_count": { "field": "hostname" }
        },
        "memory": {
            "value_count": { "field": "memory" }
        }
    }
}

The current result is:

{
    "data": {
        "num_hits": 132268,
        "hits": [
            {
                "hostname": "pc1",
                "memory": 4294967296
            },
            {
                "hostname": "pc2",
                "memory": 4294967296
            },
            {
                "hostname": "pc3",
                "memory": 4294967296
            }
        ],
        "aggregations": {
            "hostname": {
                "value": 0
            },
            "memory": {
                "value": 4234
            }
        }
    }
}

"memory" works because it is a numeric field (4234 is just a placeholder for whatever value is returned). "hostname" returns 0 because it is a text field.

Describe the solution you'd like

The result should be:

        "aggregations": {
            "hostname": {
                "value": 53454
            },
            "memory": {
                "value": 4234
            }
        }

Here the values of "hostname" are counted as well, even though it is a text field.

Describe alternatives you've considered

Create an extra "fields" field containing the list of fields in the object, and then run a "terms" aggregation query on it. This probably works, but it would increase both the network traffic through our Kafka pipeline and the index sizes.
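
For illustration, the workaround could look roughly like the sketch below. The helper name `with_fields_list`, the aggregation name `field_presence`, and the sample document are hypothetical; only the "fields" field and the "terms" aggregation come from the description above. Note that a "terms" aggregation counts matching documents per field name, so it agrees with a value count only when each field holds a single value per document.

```python
import json


def with_fields_list(doc: dict) -> dict:
    """Return a copy of doc with an extra "fields" entry naming its non-null fields."""
    enriched = dict(doc)
    enriched["fields"] = sorted(k for k, v in doc.items() if v is not None)
    return enriched


# Search request body: same query as before, but the aggregation buckets on
# the extra "fields" field instead of calling value_count per field.
search_request = {
    "query": "xyz",
    "max_hits": 3,
    "aggs": {
        "field_presence": {  # hypothetical aggregation name
            "terms": {"field": "fields"}
        }
    },
}

if __name__ == "__main__":
    # Hypothetical document, enriched before it is sent through the pipeline.
    print(json.dumps(with_fields_list({"hostname": "pc1", "memory": 4294967296})))
    print(json.dumps(search_request, indent=4))
```

Every document carries the extra list, which is where the added traffic and index size come from.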

Additional context

We need the count to cover all values, even repeated ones. The aggregation counts must correspond to the query provided.
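
As a reference for the requested semantics, here is a minimal sketch in plain Python, not Quickwit or Tantivy code. The function `value_count` and the sample hits are hypothetical, and counting each element of a multi-valued field is an assumption about the desired behavior.

```python
from typing import Iterable


def value_count(docs: Iterable[dict], field: str) -> int:
    """Count every value of `field` across docs, repeated values included."""
    total = 0
    for doc in docs:
        value = doc.get(field)
        if value is None:
            continue  # field absent in this document: contributes nothing
        if isinstance(value, list):
            total += len(value)  # assumption: each value of a multi-valued field counts
        else:
            total += 1
    return total


# Hypothetical documents matching the query; "pc1" appears twice on purpose.
hits = [
    {"hostname": "pc1", "memory": 4294967296},
    {"hostname": "pc1", "memory": 4294967296},
    {"memory": 4294967296},
]

print(value_count(hits, "hostname"))  # 2: the repeated "pc1" is counted both times
print(value_count(hits, "memory"))    # 3
```

The count does not depend on the field type: the text field is handled exactly like the numeric one.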

katlim-br • Nov 28 '24 16:11

The fix in Tantivy: https://github.com/quickwit-oss/tantivy/pull/2547

rdettai • Nov 28 '24 20:11