cube
cube copied to clipboard
Rollup joins across different data sources with different param allocators causing invalidations
Describe the bug When attempting to run the orchestration on a druid datasource to rebuild pre-aggregations. The following error is spit out in the API logs after running.
--- EDIT ---
Seems like this for me may be an underlying issue with something i'm working through. My API node is frequently assuming my druid models are my default
driver param object in more than just orchestration. It infact is invalidating my pre-aggregations because it's assuming the default
driver. I do not have a cube.js
file in my instance.
Slack thread:https://cube-js.slack.com/archives/C04NYBJP7RQ/p1715737464066809
(IF i'm just doing it wrong... let me know 😓 )
{
"action": "post",
"selector": {
"contexts": [{ "securityContext": {} }],
"timezones": ["UTC"],
"dataSources": ["druid"]
}
}
{"message":"Downloading external pre-aggregation via query error","queryKey":
[["CREATE TABLE cube_production_pre_aggregations.video_views_video_rollup_count20240415 AS SELECT\n \"video_views\".video_id \"video_views__video_id\", DATE_TRUNC('day', CAST(TIME_FORMAT(\"video_views\".__time, 'yyyy-MM-dd HH:mm:ss', 'UTC') AS TIMESTAMP)) \"video_views__time_day\", sum(\"video_views\".views) \"video_views__view_count\"\n FROM\n TeamVideoView AS \"video_views\" WHERE (\"video_views\".__time >= TIME_PARSE($1) AND \"video_views\".__time <= TIME_PARSE($2)) GROUP BY 1, 2",["2024-04-15T00:00:00.000Z","2024-04-21T23:59:59.999Z"],{}],[[{"refresh_key":null}]]],"query":"SELECT\n \"video_views\".video_id \"video_views__video_id\", DATE_TRUNC('day', CAST(TIME_FORMAT(\"video_views\".__time, 'yyyy-MM-dd HH:mm:ss', 'UTC') AS TIMESTAMP)) \"video_views__time_day\", sum(\"video_views\".views) \"video_views__view_count\"\n FROM\n TeamVideoView AS \"video_views\" WHERE (\"video_views\".__time >= TIME_PARSE($1) AND \"video_views\".__time <= TIME_PARSE($2)) GROUP BY 1, 2","values":["2024-04-15T00:00:00.000Z","2024-04-21T23:59:59.999Z"],"targetTableName":"cube_production_pre_aggregations.video_views_video_rollup_count20240415_lnicrtcl_403ejwb4_1j4a3mu","requestId":"6e558a4f-3de1-4e19-a350-fb41acb687a6","newVersionEntry":{"table_name":"cube_production_pre_aggregations.video_views_video_rollup_count20240415","structure_version":"403ejwb4","content_version":"lnicrtcl","last_updated_at":1715801822362,"naming_version":2},"buildRangeEnd":"2024-04-21T23:59:59.999","error":"Error: org.apache.calcite.runtime.CalciteContextException: From line 4, column 81 to line 4, column 82: Column '$1' not found in any table.__time >= TIME_PARSE($1) AND \"video_views\".__time <= TIME_PARSE($2)) GROUP BY 1, 2","values":["2024-04-15T00:00:00.000Z","2024-04-21T23:59:59.999Z"],"targetTableName":"cube_production_pre_aggregations.video_views_video_rollup_count20240415_lnicrtcl_403ejwb4_1j4a3mu","requestId":"6e558a4f-3de1-4e19-a350-fb41acb687a6","newVersionEntry":{"table_name":"cube_production_pre_aggregations.video_views_video_rollup_count20240415","structure_version":"403ejwb4","content_version":"lnicrtcl","last_updated_at":1715801822362,"naming_version":2},"buildRangeEnd":"2024-04-21T23:59:59.999","error":"Error: org.apache.calcite.runtime.CalciteContextException: From line 4, column 81 to line 4, column 82: Column '$1' not found in any table
Focusing on specifically Error: org.apache.calcite.runtime.CalciteContextException: From line 4, column 81 to line 4, column 82: Column '$1' not found in any table
at /cube/node_modules/@cubejs-backend/druid-driver/src/DruidClient.ts:92:19\n
at processTicksAndRejections (node:internal/process/task_queues:95:5)\n
at DruidDriver.downloadQueryResults (/cube/node_modules/@cubejs-backend/druid-driver/src/DruidDriver.ts:140:31)\n
at PreAggregationLoader.refreshReadOnlyExternalStrategy (/cube/node_modules/@cubejs-backend/query-orchestrator/src/orchestrator/PreAggregations.ts:1154:19)\n
at Object.query (/cube/node_modules/@cubejs-backend/query-orchestrator/src/orchestrator/QueryCache.ts:602:26)\n
at QueryQueue.processQuery (/cube/node_modules/@cubejs-backend/query-orchestrator/src/orchestrator/QueryQueue.js:842:25)"}
To Reproduce Steps to reproduce the behavior:
- setup druid datasource and another default (postgres / redshift in my case)
CUBEJS_DATASOURCES: "default,druid"
- setup pre-aggregation on druid cube with a refresh key as such
refreshKey: {
every: `1 hour`,
incremental: true
},
- hit the orchestration API with the above API call
Expected behavior
It seems that this process isn't utilizing the druid based ?
expected variable but rather the postgres driver param in my case (which is redshift / postgres $1
syntax )
IT would appear that this method is being called with a passed in $1
args
Version: Latest version