[BUG] `fields` changes the result of subsequent commands that depend on the selected fields
What is the bug?
Running a stats count() on a nested field returns 0. If the field is first selected with `fields` before running the count, the correct value is returned.
How can one reproduce the bug? Steps to reproduce the behavior:
- Go to the OpenSearch Playground Query Workbench: https://playground.opensearch.org/app/opensearch-query-workbench#/
- Click on PPL, enter the PPL "source = jaeger-span-* | stats count (references.refType)", and hit Run
- Output will be 0
- Enter the PPL "source = jaeger-span-* | fields references.refType | stats count (references.refType)" and hit Run
- Output will be 193
What is the expected behavior?
The stats count() should return a consistent value regardless of any preceding field selection.
What is your host/environment?
- OS: [e.g. iOS]
- Version [e.g. 22]
- Plugins
Do you have any screenshots? If applicable, add screenshots to help explain your problem.
Do you have any additional context? Add any other context about the problem.
This may not be a bug in the SQL plugin. For a nested field, the equivalent OpenSearch DSL query also returns 0. Try the DSL below:
GET jaeger-span-*/_search
{
  "size": 0,
  "aggs": {
    "count_refType": {
      "value_count": {
        "field": "references.refType"
      }
    }
  }
}
The PPL "source = jaeger-span-* | stats count (references.refType)" is equivalent to the DSL query above. But "source = jaeger-span-* | fields references.refType | stats count (references.refType)" is equivalent to first retrieving the nested field's values with a search and then aggregating over them.
Opened https://github.com/opensearch-project/OpenSearch/issues/14347 to track this.
Is this fixed with the related OpenSearch issue @LantaoJin?
No. It's not an OpenSearch core bug. An aggregation query on a nested field should use a nested aggregation builder, but in our v2 engine the nested-field information is lost during expression analysis. As a result, we mistakenly use a normal aggregation builder to query the nested field instead of a nested aggregation builder. I'm investigating how to retain the nested-field information during expression analysis and trigger the nested aggregation builder in v2.
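To make the difference concrete, here is a minimal sketch of the two request bodies, written as plain Python dicts rather than the plugin's actual Java builders. It assumes `references` is mapped as a `nested` field with path `references`; the helper `value_count_agg`, its `nested_path` parameter, and the aggregation names `count` and `nested_scope` are hypothetical and chosen only for illustration.

```python
import json
from typing import Optional


def value_count_agg(field: str, nested_path: Optional[str] = None) -> dict:
    """Build a search body that counts the values of `field`.

    `nested_path` stands in for the nested-field information that the
    analyzer would need to carry (hypothetical parameter). When it is
    set, the value_count is wrapped in a `nested` aggregation.
    """
    count = {"count": {"value_count": {"field": field}}}
    if nested_path is None:
        # Plain aggregation: the shape reported to return 0 in this thread.
        aggs = count
    else:
        # Nested aggregation: scope the count to the nested documents.
        aggs = {"nested_scope": {"nested": {"path": nested_path}, "aggs": count}}
    return {"size": 0, "aggs": aggs}


plain_body = value_count_agg("references.refType")
nested_body = value_count_agg("references.refType", nested_path="references")

print(json.dumps(nested_body, indent=2))
```

Either body can be sent to `GET jaeger-span-*/_search`. If the mapping assumption holds, the count for the second body should appear under `aggregations.nested_scope.count.value` in the response.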
#2814 is stalled, so I am going to raise another PR to fix this issue for PPL first.