druid icon indicating copy to clipboard operation
druid copied to clipboard

Group By query on Theta sketches breaks in version 25.

Open shubhraGupta opened this issue 2 years ago • 10 comments

We did rolling upgrade on our Staging Cluster from Version 22 and version 25 and saw one of the Theta sketch queries are breaking in version 25.

Affected Version

Version 25

Description

Below query is breaking in version 25.

with t1(merge_column, s1) as
    (
      SELECT 0 as merge_column,
             DS_THETA(c1)
        FROM table1
    )


      SELECT 
        gp_col1,
        sum(ts2),
        sum(ts1),
        sum(ts3)
      FROM (
               SELECT gp_col1,
                      gp_multiplier,
                      THETA_SKETCH_ESTIMATE((DS_THETA(s2))) as ts2,
                      THETA_SKETCH_ESTIMATE((DS_THETA(s1))) as ts1,
                      gp_multiplier * THETA_SKETCH_ESTIMATE(THETA_SKETCH_INTERSECT(DS_THETA(s2), DS_THETA(s1))) as ts3
                 FROM (
                        SELECT 0 as merge_column,
                               gp_col1,
                               gp_multiplier,
                               DS_THETA(c2) as s2
                          FROM table2
                         WHERE gp_multiplier != 0 
                         GROUP BY gp_col1,
                      gp_multiplier
                      ) t2
                 JOIN t1
                   ON t1.merge_column = t2.merge_column
                GROUP BY gp_col1,
                      gp_multiplier
            )
      GROUP BY gp_col1

In version 22, we get non zero values for sum(ts2), sum(ts1) and sum(ts3). In version 25, we are getting non zero value for only for sum(ts2), others sums are returning 0. Though ts2, ts1 and ts3 in subquery are returning correct values.

We checked the Query Plan in v22 and v25 and find out query plan has changed in version 25. The same query comes under type 'TopN' in version 22 and is of type GroupBy in version 25.

Explain Query in Version 22

DruidOuterQueryRel(query=[{"queryType":"topN","dataSource":
  {"type":"table","name":"__subquery__"},
  "virtualColumns":[],
  "dimension":{"type":"default","dimension":"gp_col1","outputName":"d0","outputType":"STRING"},
  "metric":{"type":"dimension","previousStop":null,"ordering":{"type":"lexicographic"}},
  "threshold":101,"intervals":{"type":"intervals",
  "intervals":["-146136543-09-08T08:23:32.096Z/146140482-04-24T15:36:27.903Z"]},
  "filter":null,
  "granularity":{"type":"all"},
  "aggregations":[{"type":"doubleSum","name":"a0","fieldName":"ts2","expression":null},
    {"type":"doubleSum","name":"a1","fieldName":"ts1","expression":null},
    {"type":"doubleSum","name":"a2","fieldName":"ts3","expression":null}],
  "postAggregations":[],"context":{"sqlOuterLimit":101,"sqlQueryId":"ba9dcb86-9f4f-4c01-ab04-569ffd1cec08"},
"descending":false}],
 signature=[{d0:STRING, a0:DOUBLE, a1:DOUBLE, a2:DOUBLE}]) 
DruidJoinQueryRel(condition=[=($0, $4)], joinType=[inner], 
  query=[{"queryType":"groupBy",
    "dataSource":{
    "type":"table","name":"__join__"},
    "intervals":{
    "type":"intervals","intervals":["-146136543-09-08T08:23:32.096Z/146140482-04-24T15:36:27.903Z"]},
    "virtualColumns":[],"filter":null,"granularity":{"type":"all"},
    "dimensions":
    [{"type":"default","dimension":"gp_col1","outputName":"d0","outputType":"STRING"},
      {"type":"default","dimension":"gp_multiplier","outputName":"d1","outputType":"STRING"}],
    "aggregations":[
      {"type":"thetaSketch","name":"a0","fieldName":"s2","size":16384,"shouldFinalize":true,"isInputThetaSketch":false,"errorBoundsStdDev":null},
  {"type":"thetaSketch","name":"a1","fieldName":"EXPR$1","size":16384,
  "shouldFinalize":true,"isInputThetaSketch":false,"errorBoundsStdDev":null}],
    "postAggregations":[{"type":"thetaSketchEstimate","name":"p1",
      "field":{
      "type":"fieldAccess","name":"p0","fieldName":"a0"},"errorBoundsStdDev":null},
      {"type":"thetaSketchEstimate","name":"p3",
        "field":{"type":"fieldAccess","name":"p2","fieldName":"a1"},"errorBoundsStdDev":null},
        {"type":"thetaSketchEstimate","name":"p7",
          "field":{"type":"thetaSketchSetOp","name":"p6","func":"INTERSECT","size":16384,
            "fields":[{"type":"fieldAccess","name":"p4","fieldName":"a0"},
              {"type":"fieldAccess","name":"p5","fieldName":"a1"}]},"errorBoundsStdDev":null},
            {"type":"expression","name":"p8","expression":"(CAST(\"d1\", 'DOUBLE') * p7)","ordering":null}
          ],"having":null,"limitSpec":{"type":"NoopLimitSpec"},
    "context":{"sqlOuterLimit":101,"sqlQueryId":"ba9dcb86-9f4f-4c01-ab04-569ffd1cec08"},"descending":false}], 
  signature=[{d0:STRING, p1:DOUBLE, p3:DOUBLE, p8:DOUBLE}]) 
DruidQueryRel(query=[{"queryType":"groupBy","dataSource":{"type":"table","name":"table2"},
  "intervals":{"type":"intervals","intervals":["2023-03-01T00:00:00.000Z/2023-03-01T00:00:00.001Z"]},
  "virtualColumns":[],"filter":{
    "type":"and","fields":[{
      "type":"not","field":{
        "type":"bound","dimension":"gp_multiplier","lower":"0","upper":"0","lowerStrict":false,"upperStrict":false,"extractionFn":null,"ordering":{
        "type":"numeric"}}},{
          "type":"not","field":{
          "type":"selector","dimension":"gp_col1","value":null,"extractionFn":null}}]},
    "granularity":{"type":"all"},
    "dimensions":[{"type":"default","dimension":"gp_col1","outputName":"d0","outputType":"STRING"},
      {"type":"default","dimension":"gp_multiplier","outputName":"d1","outputType":"STRING"}],
    "aggregations":[{"type":"thetaSketch","name":"a0","fieldName":"c2","size":16384,"shouldFinalize":true,"isInputThetaSketch":false,"errorBoundsStdDev":null}],
    "postAggregations":[{
    "type":"expression","name":"p0","expression":"0","ordering":null},{
    "type":"fieldAccess","name":"p1","fieldName":"a0"}],"having":null,
    "limitSpec":{"type":"NoopLimitSpec"},"context":{
    "sqlOuterLimit":101,"sqlQueryId":"ba9dcb86-9f4f-4c01-ab04-569ffd1cec08"},"descending":false}],
     signature=[{p0:LONG, d0:STRING, d1:STRING, p1:COMPLEX}]) 
DruidQueryRel(query=[{"queryType":"timeseries","dataSource":{
"type":"table","name":"table1"},
"intervals":{
"type":"intervals",
"intervals":["2023-03-15T00:00:00.000Z/2023-03-16T00:00:00.000Z"]},
"descending":false,"virtualColumns":[],"filter":null,"granularity":{"type":"all"},
"aggregations":[{
"type":"thetaSketch","name":"a0","fieldName":"c1","size":16384,"shouldFinalize":true,"isInputThetaSketch":false,"errorBoundsStdDev":null}],
"postAggregations":[{"type":"expression","name":"p0","expression":"0","ordering":null},
  {"type":"fieldAccess","name":"p1","fieldName":"a0"}],"limit":2147483647,"context":{
"sqlOuterLimit":101,"sqlQueryId":"ba9dcb86-9f4f-4c01-ab04-569ffd1cec08"}}], signature=[{p0:LONG, p1:COMPLEX}])

Explain Query in version 25

{
  "queryType": "groupBy",
  "dataSource": {
    "type": "query",
    "query": {
      "queryType": "groupBy",
      "dataSource": {
        "type": "join",
        "left": {
          "type": "query",
          "query": {
            "queryType": "groupBy",
            "dataSource": {
              "type": "table",
              "name": "table2"
            },
            "intervals": {
              "type": "intervals",
              "intervals": [
                "-146136543-09-08T08:23:32.096Z/146140482-04-24T15:36:27.903Z"
              ]
            },
            "filter": {
              "type": "and",
              "fields": [
                {
                  "type": "not",
                  "field": {
                    "type": "bound",
                    "dimension": "gp_multiplier",
                    "lower": "0",
                    "upper": "0",
                    "ordering": {
                      "type": "numeric"
                    }
                  }
                },
                {
                  "type": "not",
                  "field": {
                    "type": "selector",
                    "dimension": "gp_col1",
                    "value": null
                  }
                }
              ]
            },
            "granularity": {
              "type": "all"
            },
            "dimensions": [
              {
                "type": "default",
                "dimension": "gp_col1",
                "outputName": "d0",
                "outputType": "STRING"
              },
              {
                "type": "default",
                "dimension": "gp_multiplier",
                "outputName": "d1",
                "outputType": "STRING"
              }
            ],
            "aggregations": [
              {
                "type": "thetaSketch",
                "name": "a0",
                "fieldName": "c2",
                "size": 16384,
                "shouldFinalize": false
              }
            ],
            "postAggregations": [
              {
                "type": "expression",
                "name": "p0",
                "expression": "0"
              }
            ],
            "limitSpec": {
              "type": "NoopLimitSpec"
            },
            "context": {
              "queryId": "1b05327f-6904-4e6b-8876-225929d81ab9",
              "sqlOuterLimit": 1001,
              "sqlQueryId": "1b05327f-6904-4e6b-8876-225929d81ab9",
              "useNativeQueryExplain": true
            }
          }
        },
        "right": {
          "type": "query",
          "query": {
            "queryType": "timeseries",
            "dataSource": {
              "type": "table",
              "name": "table1"
            },
            "intervals": {
              "type": "intervals",
              "intervals": [
                "-146136543-09-08T08:23:32.096Z/146140482-04-24T15:36:27.903Z"
              ]
            },
            "granularity": {
              "type": "all"
            },
            "aggregations": [
              {
                "type": "thetaSketch",
                "name": "a0",
                "fieldName": "c1",
                "size": 16384,
                "shouldFinalize": false
              }
            ],
            "postAggregations": [
              {
                "type": "expression",
                "name": "p0",
                "expression": "0"
              }
            ],
            "context": {
              "queryId": "1b05327f-6904-4e6b-8876-225929d81ab9",
              "sqlOuterLimit": 1001,
              "sqlQueryId": "1b05327f-6904-4e6b-8876-225929d81ab9",
              "useNativeQueryExplain": true
            }
          }
        },
        "rightPrefix": "j0.",
        "condition": "(\"p0\" == \"j0.p0\")",
        "joinType": "INNER"
      },
      "intervals": {
        "type": "intervals",
        "intervals": [
          "-146136543-09-08T08:23:32.096Z/146140482-04-24T15:36:27.903Z"
        ]
      },
      "granularity": {
        "type": "all"
      },
      "dimensions": [
        {
          "type": "default",
          "dimension": "d0",
          "outputName": "_d0",
          "outputType": "STRING"
        },
        {
          "type": "default",
          "dimension": "d1",
          "outputName": "_d1",
          "outputType": "STRING"
        }
      ],
      "aggregations": [
        {
          "type": "thetaSketch",
          "name": "_a0:a",
          "fieldName": "a0",
          "size": 16384,
          "shouldFinalize": false
        },
        {
          "type": "thetaSketch",
          "name": "_a1:a",
          "fieldName": "j0.a0",
          "size": 16384,
          "shouldFinalize": false
        }
      ],
      "postAggregations": [
        {
          "type": "thetaSketchEstimate",
          "name": "p1",
          "field": {
            "type": "fieldAccess",
            "name": "p0",
            "fieldName": "_a0:a"
          }
        },
        {
          "type": "thetaSketchEstimate",
          "name": "p3",
          "field": {
            "type": "fieldAccess",
            "name": "p2",
            "fieldName": "_a1:a"
          }
        },
        {
          "type": "thetaSketchEstimate",
          "name": "p7",
          "field": {
            "type": "thetaSketchSetOp",
            "name": "p6",
            "func": "INTERSECT",
            "size": 16384,
            "fields": [
              {
                "type": "fieldAccess",
                "name": "p4",
                "fieldName": "_a0:a"
              },
              {
                "type": "fieldAccess",
                "name": "p5",
                "fieldName": "_a1:a"
              }
            ]
          }
        },
        {
          "type": "expression",
          "name": "p8",
          "expression": "(CAST(\"_d1\", 'DOUBLE') * \"p7\")"
        }
      ],
      "limitSpec": {
        "type": "NoopLimitSpec"
      },
      "context": {
        "queryId": "1b05327f-6904-4e6b-8876-225929d81ab9",
        "sqlOuterLimit": 1001,
        "sqlQueryId": "1b05327f-6904-4e6b-8876-225929d81ab9",
        "useNativeQueryExplain": true
      }
    }
  },
  "intervals": {
    "type": "intervals",
    "intervals": [
      "-146136543-09-08T08:23:32.096Z/146140482-04-24T15:36:27.903Z"
    ]
  },
  "granularity": {
    "type": "all"
  },
  "dimensions": [
    {
      "type": "default",
      "dimension": "_d0",
      "outputName": "d0",
      "outputType": "STRING"
    }
  ],
  "aggregations": [
    {
      "type": "doubleSum",
      "name": "a0",
      "fieldName": "p1"
    },
    {
      "type": "doubleSum",
      "name": "a1",
      "fieldName": "p3"
    },
    {
      "type": "doubleSum",
      "name": "a2",
      "fieldName": "p8"
    }
  ],
  "limitSpec": {
    "type": "default",
    "columns": [],
    "limit": 1001
  },
  "context": {
    "queryId": "1b05327f-6904-4e6b-8876-225929d81ab9",
    "sqlOuterLimit": 1001,
    "sqlQueryId": "1b05327f-6904-4e6b-8876-225929d81ab9",
    "useNativeQueryExplain": true
  }
}

shubhraGupta avatar May 22 '23 16:05 shubhraGupta

I'm wondering if you have a way to reproduce this on a fresh install? Like a dataset that can be loaded, where we can run this query on that dataset? It would help in debugging what is going on.

gianm avatar May 24 '23 01:05 gianm

@gianm I attached two parquet files and ingestion json, also screenshot showing the two engines and result. I got the bug replicated with the data set on a docker 25.0 image. 14326.zip

Instructions

  1. in a local directory, unzip the attachment
  2. start 25.0 docker image, add a volume like .:/data to all druid node
  3. cat ingest.json| curl -H "Content-Type: application/json" -X POST -d "$(</dev/stdin)" "http://localhost:8081/druid/indexer/v1/task" cat ingest2.json| curl -H "Content-Type: application/json" -X POST -d "$(</dev/stdin)" "http://localhost:8081/druid/indexer/v1/task"
  4. wait for completion, run the query in unified console with different engines

djiangc avatar May 24 '23 15:05 djiangc

To clarify, the sql-native engine has a breaking change, while the multi-stage engine produced the correct result. My guess is multi-stage materialized the intermediate result.

djiangc avatar May 24 '23 15:05 djiangc

@gianm were you able to duplicate the issue?

djiangc avatar May 25 '23 14:05 djiangc

@gianm did you get a chance to review this?

djiangc avatar May 30 '23 23:05 djiangc

@gianm did you get a chance to review this?

@djiangc I just fetched the zip file and loaded up the data to try to figure out what is going on here. I noticed the query from the original post doesn't run on these tables, possibly because the table and column names have been changed. Do you have a copy of the query with table and column names that match the ones in 14326.zip?

gianm avatar May 31 '23 00:05 gianm

@gianm sorry for the late response, please find the SQL below query.sql.gz

djiangc avatar Jun 01 '23 20:06 djiangc

Hi @djiangc— thank you for the details! I tried this on Druid 26 or latest master, and got the following results in both the native engine & MSQ. It looks like they match your screenshot. It looks like this issue has already been fixed in Druid 26!

image

gianm avatar Jun 07 '23 11:06 gianm

Thanks, confirmed fixed. many thanks, @gianm

djiangc avatar Jun 07 '23 13:06 djiangc

This issue has been marked as stale due to 280 days of inactivity. It will be closed in 4 weeks if no further activity occurs. If this issue is still relevant, please simply write any comment. Even if closed, you can still revive the issue at any time or discuss it on the [email protected] list. Thank you for your contributions.

github-actions[bot] avatar Mar 14 '24 00:03 github-actions[bot]

This issue has been closed due to lack of activity. If you think that is incorrect, or the issue requires additional review, you can revive the issue at any time.

github-actions[bot] avatar Apr 11 '24 00:04 github-actions[bot]