timescaledb icon indicating copy to clipboard operation
timescaledb copied to clipboard

[Bug]: Refreshing C-Agg with timezone (manually or by job) results in (ERROR: timestamp out of range) if start_offset is NULL. (also reproducible when start_offset and end_offset, both are null)

Open hardikm10 opened this issue 1 year ago • 3 comments

What type of bug is this?

Unexpected error

What subsystems and features are affected?

Continuous aggregate

What happened?

Refreshing C-Agg with timezone (manually or by job) results in (ERROR: timestamp out of range) if start_offset is NULL.

The reproducable instructions are attached below. few observations I wanted to add:

  1. It is also reproducible when start_offset and end_offset, both are null.
  2. If I insert any data (even a single row) into the hyertable before creating the C-Agg(with timezone), the error **isn't reproduced ** .
  3. If timezone isn't specified, it can't be reproduced as far as I can check

TimescaleDB version affected

2.10.1

PostgreSQL version used

PostgreSQL 15.2

What operating system did you use?

Timescale Cloud

What installation method did you use?

Other

What platform did you run on?

Timescale Cloud

Relevant log output and stack trace

CALL run_job(:new_job_id);
ERROR:  timestamp out of range

CALL refresh_continuous_aggregate('sensor_summary_daily', NULL, '2023-03-21 05:00:00+00' );
ERROR:  timestamp out of range

How can we reproduce the bug?

drop table sensor_data cascade;

create table sensor_data( 
time timestamptz not null, 
sensor_id integer not null, 
cpu double precision null,
temperature double precision null );

SELECT create_hypertable('sensor_data','time');

-- if the insert below is executed, the error is not reproducible
-- INSERT into sensor_data (time, sensor_id, cpu, temperature) VALUES (now(), 1, 1.0, 1.0);


--Create CA, with data OR with no data (doesnt matter, both runs into the same error)
CREATE MATERIALIZED VIEW sensor_summary_daily
    WITH (timescaledb.continuous) AS
        SELECT
            time_bucket('1 hour', time, 'AWST')       AS bucket,
            sensor_id,
            avg(cpu)                                        AS avg_cpu
        FROM sensor_data
        GROUP BY bucket, sensor_id
        -- WITH NO DATA
        ;


-- only start_offset is set to NULL
SELECT add_continuous_aggregate_policy('sensor_summary_daily',
    start_offset      => NULL,
    end_offset        => INTERVAL '1 day',
    schedule_interval => INTERVAL '1 MINUTE'
) new_job_id \gset

-- Also runs into the same error. i.e even with start_offset and end_offset, both set to NULL
-- SELECT add_continuous_aggregate_policy('one_day',
--     start_offset      => NULL,
--     end_offset        => NULL,
--     schedule_interval => INTERVAL '1 MINUTE'
-- ) new_job_id \gset

-- first call succeds but see the second fails
CALL run_job(:new_job_id);

-- second call fails with error
CALL run_job(:new_job_id);
ERROR:  timestamp out of range

-- attempting to insert data
INSERT into sensor_data (time, sensor_id, cpu, temperature) VALUES (now(), 1, 1.0, 1.0);

CALL run_job(:new_job_id);
ERROR:  timestamp out of range

CALL refresh_continuous_aggregate('sensor_summary_daily', NULL, '2023-03-21 05:00:00+00' );
ERROR:  timestamp out of range

hardikm10 avatar Mar 22 '23 06:03 hardikm10

I'm seeing the same, just when calling CREATE MATERIALIZED VIEW to create a C-Agg based on another C-Agg (which in turn is based on another...)

Inserting a single row into the hyper table allows creating all the continuous aggregates to succeed, but it means that I am unable to dump & restore the schema

segiddins avatar Apr 03 '23 01:04 segiddins

Hello Team, I've noticed the recurrence of this error in tsdb version 2.13. This can be reproduced with the steps shared by @hardikm10.

yadhupk avatar Feb 09 '24 02:02 yadhupk

@yadhupk thank you for reporting that this is still an issue. I'll let our team working on continuous aggregates take a look.

erimatnor avatar Feb 09 '24 09:02 erimatnor

@erimatnor Do we have a progress status update by any chance? Client reached to us as this is still causing disruptions on their side requiring them to manually track and refresh the failing hierarchical caggs. Having an ETA for this fix would be highly appreciated.

dariogit-bit avatar Feb 19 '24 12:02 dariogit-bit

@dariogit-bit it was fixed in #6660 and will be part of the upcoming 2.14.2

fabriziomello avatar Feb 19 '24 13:02 fabriziomello

I am having the same (or at least similar) issue.

  • Multiple CAGGs working fine.
  • Monthly does not
SELECT remove_continuous_aggregate_policy('dpv_monthly_summary_view');
SELECT add_continuous_aggregate_policy('dpv_monthly_summary_view',
            start_offset => NULL,
            end_offset => INTERVAL '1 month',
            schedule_interval => INTERVAL '1 hour')

Update to 2.14.2 did not solve the problem.

Adding start_offset => INTERVAL '4 month', works around it

mindrunner avatar Feb 23 '24 12:02 mindrunner

Hello @hardikm10,

I was able to reproduce the issue with TimescaleDB 2.14.2 and our latest development version (9f2d4e983e1f554cfe8c78b0187dfcf9cc902074). The NULL value for the start_offset of variable-sized buckets (e.g., a timezone is specified) is not handled properly in the CAgg refresh job. I will open a PR to fix the issue soon.

jnidzwetzki avatar Mar 04 '24 15:03 jnidzwetzki

@jnidzwetzki Thank you, Jan!

hardikm10 avatar Mar 05 '24 04:03 hardikm10