OpenSearch
[BUG] The aggs result of NestedAggregator with sub NestedAggregator may be inaccurate
Describe the bug
The result of a NestedAggregator with a sub NestedAggregator is inaccurate. In the reproduction below, both doc_count values should be 4.
Related component
Search:Aggregations
To Reproduce
- Create the index:
PUT index1_nest111
{
  "settings": {
    "index.refresh_interval": "30s"
  },
  "mappings": {
    "properties": {
      "nested1": {
        "type": "nested",
        "properties": {
          "name": {
            "type": "keyword"
          }
        }
      },
      "nested2": {
        "type": "nested",
        "properties": {
          "age": {
            "type": "long"
          }
        }
      }
    }
  }
}
- Index the data. The 4 documents are identical except for the _id:
POST _bulk?refresh=true
{ "index": { "_index": "index1_nest111", "_id": "1" } }
{ "nested2": {"age":1}, "nested1": {"name": "name1"} }
{ "index": { "_index": "index1_nest111", "_id": "2" } }
{ "nested2": {"age":1}, "nested1": {"name": "name1"} }
POST _bulk?refresh=true
{ "index": { "_index": "index1_nest111", "_id": "3" } }
{ "nested2": {"age":1}, "nested1": {"name": "name1"} }
{ "index": { "_index": "index1_nest111", "_id": "4" } }
{ "nested2": {"age":1}, "nested1": {"name": "name1"} }
- Run the aggregation:
POST index1_nest111/_search
{
  "aggregations": {
    "out_nested": {
      "aggregations": {
        "out_terms": {
          "aggregations": {
            "inner_nested": {
              "aggregations": {
                "inner_terms": {
                  "terms": {
                    "field": "nested1.name"
                  }
                }
              },
              "nested": {
                "path": "nested1"
              }
            }
          },
          "terms": {
            "field": "nested2.age"
          }
        }
      },
      "nested": {
        "path": "nested2"
      }
    }
  },
  "size": 0
}
Expected behavior
The inner_nested.doc_count should also be 4.
If this is a bug, I'd be pleased to fix it.
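To make the expectation easy to check, here is a minimal sketch in Python that scans a search response (already parsed into a dict) for buckets whose inner_nested.doc_count differs from the expected value. The helper name inner_nested_mismatches and the sample response are hypothetical; the sample is hand-written to show the expected values and was not captured from a cluster.

```python
def inner_nested_mismatches(response, expected):
    """Return (terms key, inner_nested doc_count) for every out_terms bucket
    whose inner_nested.doc_count differs from `expected`."""
    buckets = response["aggregations"]["out_nested"]["out_terms"]["buckets"]
    return [
        (b["key"], b["inner_nested"]["doc_count"])
        for b in buckets
        if b["inner_nested"]["doc_count"] != expected
    ]


# Hand-written response illustrating the EXPECTED values (assumption, not real output).
expected_response = {
    "aggregations": {
        "out_nested": {
            "doc_count": 4,
            "out_terms": {
                "buckets": [
                    {
                        "key": 1,
                        "doc_count": 4,
                        "inner_nested": {
                            "doc_count": 4,
                            "inner_terms": {
                                "buckets": [{"key": "name1", "doc_count": 4}]
                            },
                        },
                    }
                ]
            },
        }
    }
}

# An empty list means every bucket matches the expectation.
print(inner_nested_mismatches(expected_response, 4))
```

Running the same helper on the response returned by the aggregation request above would list the buckets affected by the bug.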
Additional Details
Host/Environment (please complete the following information):
- OS: os2.9
The nested2 child belongs to the outer nested aggregation, and the nested1 child belongs to the inner nested aggregation.
To help explain the description above:
When the inner nested aggregation executes, parentDoc=0 (the first Lucene document ID) is discarded: https://github.com/opensearch-project/OpenSearch/blob/main/server/src/main/java/org/opensearch/search/aggregations/bucket/nested/NestedAggregator.java#L196
We can see that parentDoc is not always greater than childDoc, which means that the logic of processBufferedChildBuckets is wrong: it aggregates unrelated documents.
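The problem can be illustrated with a small Python model. This is only a sketch under stated assumptions, not the actual NestedAggregator code: it assumes that the children of a parent are taken to be the child documents lying strictly between the previous parent doc and the parent doc, that a parent at doc 0 is skipped, and that within one segment each root document is stored as [nested2 child, nested1 child, root]. The function names and the doc layout are hypothetical.

```python
from bisect import bisect_right


def prev_set_bit(sorted_ids, index):
    """Largest id in sorted_ids that is <= index, or -1 if there is none."""
    pos = bisect_right(sorted_ids, index)
    return sorted_ids[pos - 1] if pos else -1


def children_in_window(parent_doc, parent_docs, child_docs):
    """Child ids collected for parent_doc under the assumed window logic."""
    if parent_doc == 0:
        # Assumed guard: a parent at doc 0 is treated as having no children.
        return []
    prev_parent = prev_set_bit(parent_docs, parent_doc - 1)
    return [c for c in child_docs if prev_parent < c < parent_doc]


# Assumed layout of one segment holding two root documents, A and B:
#   0: nested2 child of A   1: nested1 child of A   2: root A
#   3: nested2 child of B   4: nested1 child of B   5: root B
roots = [2, 5]
nested2_docs = [0, 3]
nested1_docs = [1, 4]

# Outer aggregation: parents are root docs, children are nested2 docs.
for parent in roots:
    print("outer", parent, children_in_window(parent, roots, nested2_docs))

# Inner aggregation: parents are nested2 docs, children are nested1 docs.
for parent in nested2_docs:
    print("inner", parent, children_in_window(parent, nested2_docs, nested1_docs))
```

In this model the outer aggregation pairs each root with its own nested2 child. The inner aggregation skips parent 0 and, for parent 3 (which belongs to root B), collects child 1 (which belongs to root A), because nested1 children are siblings of nested2 docs rather than docs that always precede them.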