Format all timestamps in query timezone
Is your feature request related to a problem? Please describe.
Results with timestamps are inconsistently converted to the query timezone depending on if it is used as a dimension, a time dimension, or a time dimension with granularity.
This makes presenting time data error-prone because it's unclear what timezone the data is formatted in since none of the results contain offsets.
Describe the solution you'd like
Always format timestamps in the query timezone to reduce timezone pitfalls.
Describe alternatives you've considered
- Another option would be to include the offset (or
Z) when the time differs from the query timezone, making it unambiguous. - Using SQL_UTILS.convertTz is a workaround, but requires defining additional dimensions and introduces pitfalls such as accidentally applying the offset twice in time dimensions:
Dimensions that use SQL_UTILS.convertTz() should not be used as timeDimensions in queries. Doing so will apply the conversion multiple times and yield wrong results.
Additional context
Example Schema for BigQuery:
cube('Test', {
sql: `
SELECT TIMESTAMP('2021-06-15T00:00:00', 'America/Los_Angeles') AS Created,
TIMESTAMP('2021-06-15T00:00:00', 'America/Los_Angeles') AS Updated
`,
dimensions: {
created: {
sql: 'Created',
type: 'time',
},
updated: {
sql: 'Updated',
type: 'time',
},
},
});
Query:
{
"timeDimensions": [
{
"dimension": "Test.created",
"dateRange": [
"2021-06-15",
"2021-06-15"
],
"granularity": "second"
}
],
"dimensions": [
"Test.updated"
],
"timezone": "America/Los_Angeles"
}
Results:
[
{
// These are in the query timezone
"Test.created.second": "2021-06-15T00:00:00.000",
"Test.created": "2021-06-15T00:00:00.000"
// This one is not in the query timezone
// and it's ambiguous which timezone it is in.
// Expected: "2021-06-15T00:00:00.000" or "2021-06-15T07:00:00.000Z"
"Test.updated": "2021-06-15T07:00:00.000",
}
]
Implementation:
A potential path could be to convert in the API Gateway as part of transformValue, but a better implementation could be in the drivers.
diff --git a/packages/cubejs-api-gateway/src/gateway.ts b/packages/cubejs-api-gateway/src/gateway.ts
index c4a00ab6..17901314 100644
--- a/packages/cubejs-api-gateway/src/gateway.ts
+++ b/packages/cubejs-api-gateway/src/gateway.ts
@@ -96,9 +96,9 @@ const prepareAnnotation = (metaConfig: MetaConfig[], query: any) => {
};
};
-const transformValue = (value, type) => {
+const transformValue = (value, type, timezone) => {
if (value && (type === 'time' || value instanceof Date)) { // TODO support for max time
- return (value instanceof Date ? moment(value) : moment.utc(value)).format(moment.HTML5_FMT.DATETIME_LOCAL_MS);
+ return (value instanceof Date ? moment(value) : moment.tz(value, timezone)).format(moment.HTML5_FMT.DATETIME_LOCAL_MS);
}
return value && value.value ? value.value : value; // TODO move to sql adapter
};
@@ -117,7 +117,7 @@ const transformData = (aliasToMemberNameMap, annotation, data, query, queryType)
const transformResult = [
memberName,
- transformValue(p[1], annotationForMember.type)
+ transformValue(p[1], annotationForMember.type, query.timezone)
];
const path = memberName.split('.');
Related:
- https://github.com/cube-js/cube.js/issues/443
- https://github.com/cube-js/cube.js/issues/689
- https://cube-js.slack.com/archives/CC0403RRR/p1612912462478500?thread_ts=1612839814.416800&cid=CC0403RRR
Just wanted to second this issue. Any date dimensions should be converted to the given timezone. For instance, I have a query where I pull orders by the order date but also display the date the customer was created at. The order date converts to the correct timezone but the customer create at date is still in UTC (the timezone it's stored as in the database)
We're also seeing this issue.
this is still not resolved with latest version
@paveltiunov, @hassankhan Any update to this issue? it would really helped out
Problem
I am applying "timezone": "Europe/Prague" in my query.
In myTable there is duration of some state in one hour
INSERT INTO MyTable (id, timestamp_column, duration_column)
VALUES
(1, '2021-11-30T00:00:00Z',3600),
(2, '2021-11-30T01:00:00Z', 3600),
(3, '2021-11-30T02:00:00Z', 3600),
...
(24, '2021-12-01T00:00:00Z', 3600),
...
(48, '2021-12-02T00:00:00Z', 3600),
...
(72, '2021-12-03T00:00:00Z', 3600),
...
(96, '2021-12-04T00:00:00Z', 3600),
...
(120, '2021-12-05T00:00:00Z', 3600);
And I need to get aggregated values with day granularity Cube.js Schema
Here is a simplified version of my Cube.js schema for reference:
cube(`Data`, {
sql: `SELECT * FROM MyTable`,
dimensions: {
from: {
sql: `${CUBE}.timestamp_column`,
type: `time`
},
},
measures: {
totalTime: {
type: `sum`,
sql: `${CUBE}.duration_column`
},
}
});
Here's an example of a query I'm running:
{
"query": {
"measures": ["Data.totalTime"],
"timeDimensions": [{
"dimension": "Data.from",
"granularity": "day",
"dateRange": ["2021-12-01","2021-12-04"]
}],
"segments": ["Data.canSegment"],
"timezone": "Europe/Prague"
}
}
Expected behavior
I expect the UTC to be shifted to "Europe/Prague" TZ and data are aggregated according to this TZ.
data_day_from (in EuropeTZ) totalTime_inSeconds
2021-12-01 00:00:00.000 86400
2021-12-02 00:00:00.000 86400
2021-12-03 00:00:00.000 86400
2021-12-04 00:00:00.000 86400
Actual behavior
The query returns the following results:
data_day_from (in UTC) totalTime_inSeconds
2021-11-30 00:00:00.000 3600
2021-12-01 00:00:00.000 86400
2021-12-02 00:00:00.000 86400
2021-12-03 00:00:00.000 86400
2021-12-04 00:00:00.000 82800
The data_day_from values are not adjusted to the Europe/Prague timezone as expected, and the totalTime_inSeconds values are not consistently 86400 as they should be. There are leftovers on each side, so I guess I have no way calculate day aggregation in EuropeTZ
Thanks for the reply
I also tried to use
- CUBEJS_SCHEDULED_REFRESH_TIMEZONES=Europe/Prague in docker-compose.yml
version: '2.2'
services:
cube:
image: cubejs/cube:v0.32.14
ports:
- 4001:4000
- 15432:15432
environment:
- CUBEJS_DEV_MODE=true
- CUBEJS_SCHEDULED_REFRESH_TIMEZONES=Europe/Prague
volumes:
- .:/cube/conf
and added preaggregation
preAggregations: {
main: {
measures: [Data.totalTime],
dimensions: [DimAttribute.AssetId],
timeDimension: Data.from,
granularity: `hour`,
}
but no effect whatsoever
I am not sure if I am doing this properly, the docs are not clear to me in this matter @igorlukanin :)
@prochj51 I'm having same issue just as you describe, here an example of query
{"measures":["myCube.lostOpportunities"],"timeDimensions":[{"dimension":"myCube.created","granularity":"hour","dateRange":["2023-07-04 00:00:00.000Z","2023-07-04 23:59:59.000Z"]}],"filters":[],"timezone":"Europe/Madrid"}
Response example:
{ "queryType": "regularQuery", "results": [ { "query": { "measures": [ "myCube.lostOpportunities" ], "timeDimensions": [ { "dimension": "myCube.created", "granularity": "hour", "dateRange": [ "2023-07-04T00:00:00.000", "2023-07-04T23:59:59.000" ] } ], "filters": [], "timezone": "Europe/Madrid", "limit": 10000, "order": [], "dimensions": [], "rowLimit": 10000, "queryType": "regularQuery" }, "data": [ { "myCube.created.hour": "2023-07-03T22:00:00.000", "myCube.created": "2023-07-03T22:00:00.000", "myCube.lostOpportunities": 0 }, { "myCube.created.hour": "2023-07-04T09:00:00.000", "myCube.created": "2023-07-04T09:00:00.000", "myCube.lostOpportunities": 2 }, { "myCube.created.hour": "2023-07-04T10:00:00.000", "myCube.created": "2023-07-04T10:00:00.000", "myCube.lostOpportunities": 4 }, { "myCube.created.hour": "2023-07-04T11:00:00.000", "myCube.created": "2023-07-04T11:00:00.000", "myCube.lostOpportunities": 1 }, { "myCube.created.hour": "2023-07-04T12:00:00.000", "myCube.created": "2023-07-04T12:00:00.000", "myCube.lostOpportunities": 1 }, { "myCube.created.hour": "2023-07-04T13:00:00.000", "myCube.created": "2023-07-04T13:00:00.000", "myCube.lostOpportunities": 0 }, { "myCube.created.hour": "2023-07-04T15:00:00.000", "myCube.created": "2023-07-04T15:00:00.000", "myCube.lostOpportunities": 1 }, { "myCube.created.hour": "2023-07-04T17:00:00.000", "myCube.created": "2023-07-04T17:00:00.000", "myCube.lostOpportunities": 0 }, { "myCube.created.hour": "2023-07-04T18:00:00.000", "myCube.created": "2023-07-04T18:00:00.000", "myCube.lostOpportunities": 0 }, { "myCube.created.hour": "2023-07-04T19:00:00.000", "myCube.created": "2023-07-04T19:00:00.000", "myCube.lostOpportunities": 0 } ], "lastRefreshTime": "2023-07-05T16:06:23.735Z", "transformedQuery": { "sortedDimensions": [], "sortedTimeDimensions": [ [ "myCube.created", "second" ] ], "timeDimensions": [ [ "myCube.created", "hour" ] ], "measures": [ "myCube.lostOpportunities" ], "leafMeasureAdditive": true, "leafMeasures": [ "myCube.lostOpportunities" ], "measureToLeafMeasures": { "myCube.lostOpportunities": [ { "measure": "myCube.lostOpportunities", "additive": true, "type": "count" } ] }, "hasNoTimeDimensionsWithoutGranularity": true, "allFiltersWithinSelectedDimensions": true, "isAdditive": true, "granularityHierarchies": { "year": [ "year", "quarter", "month", "month", "day", "hour", "minute", "second" ], "quarter": [ "quarter", "month", "day", "hour", "minute", "second" ], "month": [ "month", "day", "hour", "minute", "second" ], "week": [ "week", "day", "hour", "minute", "second" ], "day": [ "day", "hour", "minute", "second" ], "hour": [ "hour", "minute", "second" ], "minute": [ "minute", "second" ], "second": [ "second" ] }, "hasMultipliedMeasures": false, "hasCumulativeMeasures": false, "windowGranularity": null, "filterDimensionsSingleValueEqual": {}, "ownedDimensions": [], "ownedTimeDimensionsWithRollupGranularity": [ [ "myCube.created", "second" ] ], "ownedTimeDimensionsAsIs": [ [ "myCube.created", "hour" ] ] }, "requestId": "157d76af-f716-4fc2-8357-30fde05061c6-span-1", "annotation": { "measures": { "myCube.lostOpportunities": { "title": "My cube Lost Opportunities", "shortTitle": "Lost Opportunities", "type": "number", "drillMembers": [], "drillMembersGrouped": { "measures": [], "dimensions": [] } } }, "dimensions": {}, "segments": {}, "timeDimensions": { "myCube.created.hour": { "title": "my Cube Created", "shortTitle": "Created", "type": "time" }, "myCube.created": { "title": "my Cube Created", "shortTitle": "Created", "type": "time" } } }, "slowQuery": true, "total": null } ], "pivotQuery": { "measures": [ "myCube.lostOpportunities" ], "timeDimensions": [ { "dimension": "myCube.created", "granularity": "hour", "dateRange": [ "2023-07-04T00:00:00.000", "2023-07-04T23:59:59.000" ] } ], "filters": [], "timezone": "Europe/Madrid", "limit": 10000, "order": [], "dimensions": [], "rowLimit": 10000, "queryType": "regularQuery" }, "slowQuery": false }
My bigger problem is not when the granularity is hour because I can manage it in front but for day granularity the result count entries of a day in previous day(depending of timezone) and there is no way of manage in front.
For those struggling with this, make sure you're setting process.env.TZ to be "Etc/UTC", per https://github.com/cube-js/cube/issues/4171#issuecomment-1079068709.