sql icon indicating copy to clipboard operation
sql copied to clipboard

[BUG] Incorrect value of quotted nested fields aggregations in JSON export

Open smortex opened this issue 1 year ago • 2 comments

What is the bug?

When quoting the field passed to an aggregation and when this field is a nested field, a SQL query with the JSON format return an incorrect value (0 or null) for the aggregation instead of the actual result.

How can one reproduce the bug?

Given an index with documents with a nested (field_nested.c2) and a non-nested field (field_float), using the following two queries to perform some basic aggregations:

  • SELECT sum(`field_float`) AS `SUM(field_float)`, sum(`field_nested.c2`) AS `SUM(field_nested.c2)` FROM data1 LIMIT 5000;
  • SELECT sum(field_float) AS `SUM(field_float)`, sum(field_nested.c2) AS `SUM(field_nested.c2)` FROM data1 LIMIT 5000;

We expect the same result. But when requesting JSON documents, the first query produce wrong values:

First query result:

{    
    "took": 1,    
    "timed_out": false,    
    "_shards": {    
        "total": 1,    
        "successful": 1,    
        "skipped": 0,    
        "failed": 0    
    },    
    "hits": {    
        "total": {    
            "value": 3,    
            "relation": "eq"    
        },    
        "max_score": null,    
        "hits": []    
    },    
    "aggregations": {    
        "SUM(field_nested.c2)": {    
            "value": 0    
        },    
        "SUM(field_float)": {    
            "value": 0.6000000163912773    
        }    
    }    
} 

Second query result:

{    
    "took": 1,    
    "timed_out": false,    
    "_shards": {    
        "total": 1,    
        "successful": 1,    
        "skipped": 0,    
        "failed": 0    
    },    
    "hits": {    
        "total": {    
            "value": 3,    
            "relation": "eq"    
        },    
        "max_score": null,    
        "hits": []    
    },    
    "aggregations": {    
        "SUM(field_nested.c2)": {    
            "value": 6    
        },    
        "SUM(field_float)": {    
            "value": 0.6000000163912773    
        }    
    }    
} 

Note how aggregations."SUM(field_nested.c2)".value is 0 in the first example and 6 it the second.

What is the expected behavior?

Quoted and non-quoted nested fields in aggregations should return the same (correct) value.

What is your host/environment?

  • OS: Debian
  • Version 11
  • OpenSearch 2.12.0

Do you have any additional context?

  • For the bug to happen, the field must be nested and quoted;
  • Only the JSON format is affected, the output of other formats has the expected value;
  • Some aggregations (e.g. sum as in the example above) always return 0, but other (e.g. avg) always return null instead of the actual value.

smortex avatar Feb 24 '24 02:02 smortex

In order to ease-up reproducing the issue, here is another example targeting the demo dataset, and that can be reproduced on https://playground.opensearch.org:

  1. Go to https://playground.opensearch.org
  2. Click on "Query Workbench"
  3. Copy & paste one of the queries bellow
  4. Click on "Run"
  5. See the expected values
  6. Click on "Download" / "Download JSON" to access the JSON result
  7. See the wrong values

Test queries:

  • Query producing incorrect results: select avg(`products.price`), min(`products.price`), max(`products.price`), avg(`total_quantity`) from opensearch_dashboards_sample_data_ecommerce;
  • Query producing the expected results: select avg(products.price), min(products.price), max(products.price), avg(total_quantity) from opensearch_dashboards_sample_data_ecommerce;

smortex avatar Feb 24 '24 02:02 smortex

Thanks for the issue!

Swiddis avatar Mar 27 '24 00:03 Swiddis