Inconsistencies with ALIGN TO CALENDAR TIME ZONE
To reproduce
Its unclear if TIME ZONE is working as expected with SAMPLE BY, especially when it comes to DST adjustments.
SAMPLE BY operates on UTC timestamps, and will return these. In order to display a timestamp in the desired time zone, you have to manually use to_timezone() in the select query, which will be correct anyway because it is DST aware. Therefore, specifying TIME ZONE doesn't seem to make much difference.
The expectations of SAMPLE BY behaviour may need to be reviewed and some work done to improve consistency in results. Its not clear here what TIME ZONE is supposed to do with the current SAMPLE BY behaviour.
DDL:
create table fill_options(ts timestamp, price int) timestamp(ts);
insert into fill_options (ts, price) values
(to_timestamp('2021-03-28:00:00:00', 'yyyy-MM-dd:HH:mm:ss'), 0),
(to_timestamp('2021-03-28:01:00:00', 'yyyy-MM-dd:HH:mm:ss'), 1),
(to_timestamp('2021-03-28:02:00:00', 'yyyy-MM-dd:HH:mm:ss'), 2),
(to_timestamp('2021-03-28:03:00:00', 'yyyy-MM-dd:HH:mm:ss'), 3),
(to_timestamp('2021-03-28:04:00:00', 'yyyy-MM-dd:HH:mm:ss'), 5)
The following two queries give identical results:
select ts, min(price) min, max(price) max, avg(price) avg, stddev_samp(price) stddev_samp
from fill_options
sample by 30m
fill(prev)
select ts, min(price) min, max(price) max, avg(price) avg, stddev_samp(price) stddev_samp
from fill_options
sample by 30m
fill(prev) align to calendar time zone 'Europe/Berlin'
The updated FILL(LINEAR) now supports ALIGN TO CALENDAR, but has no DST code. However, it still works like FILL(PREV):
In the database tests, an extra field with a converted timestamp is used to show the DST values:
select to_timezone(ts, 'Europe/Berlin') k, ts, min(price) min, max(price) max, avg(price) avg, stddev_samp(price) stddev_samp
from fill_options
sample by 30m
fill(prev)
align to calendar time zone 'Europe/Berlin'
group by ts
But if you remove group by ts, you get this:
Likewise, an example with 1d sampling and OFFSET. The expectation might be that you'd get one record per day, always in Berlin time and offset to 2 AM. But you don't:
select to_timezone(ts, 'Europe/Berlin') k, ts, min(price) min, max(price) max, avg(price) avg, stddev_samp(price) stddev_samp
from fill_options
sample by 1d
fill(prev)
align to calendar time zone 'Europe/Berlin' with offset '02:00';
If we go back a plainer example and remove the fill, we end up in the same place again:
select to_timezone(ts, 'Europe/Berlin') k, ts, min(price) min, max(price) max, avg(price) avg, stddev_samp(price) stddev_samp
from fill_options
sample by 30m
align to calendar time zone 'Europe/Berlin'
QuestDB version:
7.4.0/7.4.1
OS, in case of Docker specify Docker and the Host OS:
Windows
File System, in case of Docker specify Host File System:
NTFS
Full Name:
Nick Woolmer
Affiliation:
QuestDB
Have you followed Linux, MacOs kernel configuration steps to increase Maximum open files and Maximum virtual memory areas limit?
- [X] Yes, I have
Additional context
No response
The PR does not resolve this issue.