questdb icon indicating copy to clipboard operation
questdb copied to clipboard

Inconsistencies with ALIGN TO CALENDAR TIME ZONE

Open nwoolmer opened this issue 1 year ago • 1 comments

To reproduce

Its unclear if TIME ZONE is working as expected with SAMPLE BY, especially when it comes to DST adjustments.

SAMPLE BY operates on UTC timestamps, and will return these. In order to display a timestamp in the desired time zone, you have to manually use to_timezone() in the select query, which will be correct anyway because it is DST aware. Therefore, specifying TIME ZONE doesn't seem to make much difference.

The expectations of SAMPLE BY behaviour may need to be reviewed and some work done to improve consistency in results. Its not clear here what TIME ZONE is supposed to do with the current SAMPLE BY behaviour.

DDL:

create table fill_options(ts timestamp, price int) timestamp(ts);

insert into fill_options (ts, price) values
(to_timestamp('2021-03-28:00:00:00', 'yyyy-MM-dd:HH:mm:ss'), 0),
(to_timestamp('2021-03-28:01:00:00', 'yyyy-MM-dd:HH:mm:ss'), 1),
(to_timestamp('2021-03-28:02:00:00', 'yyyy-MM-dd:HH:mm:ss'), 2),
(to_timestamp('2021-03-28:03:00:00', 'yyyy-MM-dd:HH:mm:ss'), 3),
(to_timestamp('2021-03-28:04:00:00', 'yyyy-MM-dd:HH:mm:ss'), 5)

The following two queries give identical results:

select ts, min(price) min, max(price) max, avg(price) avg, stddev_samp(price) stddev_samp
from fill_options
sample by 30m
fill(prev)

select ts, min(price) min, max(price) max, avg(price) avg, stddev_samp(price) stddev_samp
from fill_options
sample by 30m
fill(prev) align to calendar time zone 'Europe/Berlin'

image

The updated FILL(LINEAR) now supports ALIGN TO CALENDAR, but has no DST code. However, it still works like FILL(PREV):

image

In the database tests, an extra field with a converted timestamp is used to show the DST values:

select to_timezone(ts, 'Europe/Berlin') k, ts, min(price) min, max(price) max, avg(price) avg, stddev_samp(price) stddev_samp
from fill_options
sample by 30m
fill(prev)
align to calendar time zone 'Europe/Berlin'
group by ts

image

But if you remove group by ts, you get this:

image

Likewise, an example with 1d sampling and OFFSET. The expectation might be that you'd get one record per day, always in Berlin time and offset to 2 AM. But you don't:

select to_timezone(ts, 'Europe/Berlin') k, ts, min(price) min, max(price) max, avg(price) avg, stddev_samp(price) stddev_samp
from fill_options
sample by 1d 
fill(prev)
align to calendar time zone 'Europe/Berlin' with offset '02:00';

image

If we go back a plainer example and remove the fill, we end up in the same place again:

select to_timezone(ts, 'Europe/Berlin') k, ts, min(price) min, max(price) max, avg(price) avg, stddev_samp(price) stddev_samp
from fill_options
sample by 30m
align to calendar time zone 'Europe/Berlin'

image

QuestDB version:

7.4.0/7.4.1

OS, in case of Docker specify Docker and the Host OS:

Windows

File System, in case of Docker specify Host File System:

NTFS

Full Name:

Nick Woolmer

Affiliation:

QuestDB

Have you followed Linux, MacOs kernel configuration steps to increase Maximum open files and Maximum virtual memory areas limit?

  • [X] Yes, I have

Additional context

No response

nwoolmer avatar Apr 18 '24 11:04 nwoolmer

The PR does not resolve this issue.

nwoolmer avatar May 13 '24 12:05 nwoolmer