timescaledb icon indicating copy to clipboard operation
timescaledb copied to clipboard

Autorefresh of continuous aggregates not working

Open pkuehnel opened this issue 3 years ago • 18 comments

Relevant system information:

  • OS: Debian 10
  • PostgreSQL version: 12.8
  • TimescaleDB version: 2.5.0
  • Installation method: using Docker

Describe the bug Auto refresh of continuous aggregates using SELECT add_continuous_aggregate_policy not working.

To Reproduce Steps to reproduce the behavior:

  1. Create a hypertable eg.: SELECT create_hypertable('tasmotalogs', 'Timestamp', chunk_time_interval => INTERVAL '1 day', migrate_data => true);
  2. Create continuous aggregate on this hypertable eg.:
CREATE MATERIALIZED VIEW tasmotalogs_summary_daily
WITH (timescaledb.continuous) AS
SELECT "Topic",
   time_bucket(INTERVAL '1 day', "Timestamp") AS bucket,
   AVG("Power") as "Power",
   AVG("Efficiency")as "Efficiency",
   MAX("TotalEnergy")-MIN("TotalEnergy") as "EnergyProduced",
   AVG("MaxPower") as "MaxPower"
FROM tasmotalogs
GROUP BY "Topic", bucket;
  1. Create Refresh policy:
SELECT add_continuous_aggregate_policy('tasmotalogs_summary_daily',
     start_offset => INTERVAL '1 month',
     end_offset => INTERVAL '1 h',
     schedule_interval => INTERVAL '1 h');
  1. See error everytime the refresh job is executed:
2021-11-10 14:30:19.627 UTC [32894] ERROR:  cannot execute SQL without an outer snapshot or portal,
2021-11-10 14:30:19.591 UTC [32894] LOG:  refreshing continuous aggregate "tasmotalogs_summary_daily" in window [ 2021-10-11 00:00:00, 2021-11-10 00:00:00 ],
2021-11-10 14:30:19.627 UTC [32894] LOG:  job 1007 threw an error,
2021-11-10 14:30:19.627 UTC [32894] CONTEXT:  SQL statement "DELETE FROM _timescaledb_internal._materialized_hypertable_15 AS D WHERE D.bucket >= '2021-11-09 00:00:00' AND D.bucket < '2021-11-10 00:00:00' ;",
2021-11-10 14:30:19.629 UTC [1] LOG:  background worker "Refresh Continuous Aggregate Policy [1007]" (PID 32894) exited with exit code 1

Expected behavior The corresponding continuous aggregate should be updated.

Actual behavior The job which should update the continuous aggregate crashes and data is not updated (on initial creation of continous aggregate the data is filled properly but if you query on the continuous aggreate after executing ALTER MATERIALIZED VIEW tasmotalogs_summary_daily SET (timescaledb.materialized_only = true); there is no newer data than there was after creating the continuous aggreagte.

Additional context I already saw issue #3619 but as it is closed probably nobody looks into it anymore. The solution there was to upgrade to postgres 12.X but I am already on 12.8

pkuehnel avatar Nov 10 '21 14:11 pkuehnel

@pkuehnel Thank you for the bug report. This issue should be fixed in 2.5.0 (https://github.com/timescale/timescaledb/pull/3489). Can you add the output of \dx just to verify that it really is for 2.5.0?

I tried to reproduce it locally, but it runs fine:

CREATE TABLE conditions(
    time TIMESTAMP NOT NULL,
    device_id INTEGER,
    temperature FLOAT,
    humidity FLOAT
);

SELECT * FROM create_hypertable('conditions', 'time', 'device_id', 12);

INSERT INTO conditions
SELECT time, (random()*3 + 1)::int, random()*80 - 40, random()*100
FROM generate_series(now() - INTERVAL '28 days', now(), '1 hour') AS time;

CREATE MATERIALIZED VIEW conditions_summary_hourly
WITH (timescaledb.continuous) AS
SELECT device_id,
       time_bucket(INTERVAL '1 hour', "time") AS bucket,
       AVG(temperature),
       MAX(temperature),
       MIN(temperature)
FROM conditions
GROUP BY device_id, bucket
WITH NO DATA;

SELECT add_continuous_aggregate_policy('conditions_summary_hourly',
    start_offset => INTERVAL '1 month',
    end_offset => INTERVAL '1 h',
    schedule_interval => INTERVAL '1 minute');

Log contains:

mats@mats-XPS-13-9380:~/research/postgresql$ tail -f ~/.local/var/log/postgresql/13/postgres.log 
2021-11-11 10:12:01.978 CET [1159804] WARNING:  failed to launch job 1 "Telemetry Reporter [1]": failed to start a background worker
2021-11-11 10:12:03.057 CET [1159804] WARNING:  failed to launch job 1 "Telemetry Reporter [1]": failed to start a background worker
2021-11-11 10:12:03.402 CET [1199346] WARNING:  telemetry error: HTTP connection read error
2021-11-11 10:12:07.497 CET [1199347] WARNING:  telemetry error: HTTP connection read error
2021-11-11 10:18:08.791 CET [1199484] LOG:  refreshing continuous aggregate "conditions_summary_hourly" in window [ 2021-10-11 11:00:00, 2021-11-11 09:00:00 ]
2021-11-11 10:19:08.818 CET [1199495] LOG:  refreshing continuous aggregate "conditions_summary_hourly" in window [ 2021-10-11 11:00:00, 2021-11-11 09:00:00 ]

mkindahl avatar Nov 11 '21 09:11 mkindahl

I already checked the version. Is it possible that there are any issues because my database was created with an older version? Because I had that issue, after that I updated to 2.5.0, deleted all continuous aggregates an recreated them but still not fixed. Any ideas what I can do?

Edit: Will add \dx output in a few hours

pkuehnel avatar Nov 11 '21 09:11 pkuehnel

Here is the output:

                                      List of installed extensions
    Name     | Version |   Schema   |                            Description
-------------+---------+------------+-------------------------------------------------------------------
 plpgsql     | 1.0     | pg_catalog | PL/pgSQL procedural language
 timescaledb | 2.5.0   | public     | Enables scalable inserts and complex queries for time-series data
(2 rows)

pkuehnel avatar Nov 11 '21 09:11 pkuehnel

I just tried all the statements from you and get the same error:

2021-11-12 06:21:17.058 UTC [98454] LOG:  refreshing continuous aggregate "conditions_summary_hourly" in window [ 2021-10-12 07:00:00, 2021-11-12 05:00:00 ],
2021-11-12 06:21:17.068 UTC [98454] LOG:  job 1008 threw an error,
2021-11-12 06:21:17.068 UTC [98454] ERROR:  cannot execute SQL without an outer snapshot or portal,
2021-11-12 06:21:17.068 UTC [98454] CONTEXT:  SQL statement "DELETE FROM _timescaledb_internal._materialized_hypertable_49 AS D WHERE D.bucket >= '2021-10-12 07:00:00' AND D.bucket < '2021-11-12 05:00:00' ;",
2021-11-12 06:21:17.069 UTC [1] LOG:  background worker "Refresh Continuous Aggregate Policy [1008]" (PID 98454) exited with exit code 1

Even if I create a new database (but same timescaledb Container) I get the same error:

2021-11-12 06:25:35.470 UTC [98607] LOG:  refreshing continuous aggregate "conditions_summary_hourly" in window [ 2021-10-12 07:00:00, 2021-11-12 05:00:00 ],
2021-11-12 06:25:35.478 UTC [98607] LOG:  job 1000 threw an error,
2021-11-12 06:25:35.478 UTC [98607] ERROR:  cannot execute SQL without an outer snapshot or portal,
2021-11-12 06:25:35.478 UTC [98607] CONTEXT:  SQL statement "DELETE FROM _timescaledb_internal._materialized_hypertable_2 AS D WHERE D.bucket >= '2021-10-12 07:00:00' AND D.bucket < '2021-11-12 05:00:00' ;",
2021-11-12 06:25:35.480 UTC [1] LOG:  background worker "Refresh Continuous Aggregate Policy [1000]" (PID 98607) exited with exit code 1

pkuehnel avatar Nov 12 '21 06:11 pkuehnel

@pkuehnel Could you give the commands that you actually used to start the container and to connect to the container?

I started a new container using:

docker run -d --name timescaledb -p 5432:5432 -e POSTGRES_PASSWORD=password timescale/timescaledb:latest-pg12

And connected to the server using

psql -Upostgres -hlocalhost -p5432 postgres

mkindahl avatar Nov 12 '21 13:11 mkindahl

@mkindahl I am using the container with docker-compose up -d. The docker-compose.yml looks like that:

services:
  database:
    image: timescale/timescaledb:latest-pg12
    deploy:
        resources:
            limits:
                memory: 200M
    logging:
        driver: "json-file"
        options:
            max-file: "5"
            max-size: "10m"
    restart: always
    environment:
      - POSTGRES_PASSWORD=secret
    ports:
      - 5432:5432
    volumes:
      - ./timescaleDb/data:/var/lib/postgresql/data

I am using docker exec -it smartevcharge_database_1 psql -U postgres -X to connect to to the container.

pkuehnel avatar Nov 12 '21 13:11 pkuehnel

Your logs suggest that this stmt fails during a refresh.

DELETE FROM _timescaledb_internal._materialized_hypertable_15 AS D WHERE D.bucket >= '2021-11-09 00:00:00' AND D.bucket < '2021-11-10 00:00:00' ;
  1. Does a manual refresh_continuous_aggregate command work?
  2. You could try this in a transaction and rollback and see if you get more info on why this stmt fails. BEGIN; DELETE ... ROLLBACK;

gayyappan avatar Nov 15 '21 16:11 gayyappan

  1. Initial data is correct (when creating the aggregate), manual refresh does work (at least no error message).
  2. I do not get any error when executing Begin, Delete, Rollback. (There is no data in the time window the job wants to delete something). But what I get is the following Warning: 2021-11-16 06:13:20.635 UTC [94834] WARNING: there is no transaction in progress

pkuehnel avatar Nov 15 '21 20:11 pkuehnel

Ok, new test: I tried:

begin;
CALL refresh_continuous_aggregate('tasmotalogs_summary_hourly', '2021-09-09', '2021-11-16');
commit;

Got the warning, that there is no transaction in progress and no updates in the continuous aggregate. I then tried without begin; and commit; and the continuous aggregate got updated. Note: I use DBeaver as UI.

pkuehnel avatar Nov 16 '21 14:11 pkuehnel

Just like @mkindahl , I am unable to repro your issue on 12.8. We did have an issue with outer snapshot errors on PG12.8. But this was on a setup with PG12.8 + timescale < 2.5.0.

2021-11-19 11:30:33.898 EST [10263] DEBUG:  started background job 1000
2021-11-19 11:30:33.903 EST [10263] LOG:  refreshing continuous aggregate "conditions_summary_hourly" in window [ 2021-10-19 12:00:00, 2021-11-19 10:00:00 ]
2021-11-19 11:30:33.903 EST [10263] DEBUG:  hypertable 1 existing watermark >= new invalidation threshold 1637316000000000 1637316000000000
2021-11-19 11:30:33.910 EST [10263] DEBUG:  exiting job 1000 with success

gayyappan avatar Nov 19 '21 16:11 gayyappan

Can I help you anyhow? I could send you my whole container including all files. There is nothing private in the database.

pkuehnel avatar Nov 19 '21 16:11 pkuehnel

Or do you have any ideas how I can resolve that issue?

pkuehnel avatar Nov 19 '21 17:11 pkuehnel

@pkuehnel You can try to attach a debugger to the backend, set a breakpoint on the line where the error is generated (it's inside the function EnsurePortalSnapshotExists), and once you hit the error, print out a stack trace using bt. If you can paste that to the issue, there is at least a chance that we can see if anything is wrong with the image.

mkindahl avatar Nov 23 '21 08:11 mkindahl

As I am not that deep into timescaleDb I do not want to invest more time on this issue. As workaround I created a cronjob on my Docker host executing the following command everyday:

docker exec -i containerName -U postgres -d DatabaseName -c "CALL refresh_continuous_aggregate('tasmotalogs_summary_daily', localtimestamp - INTERVAL '1 month', localtimestamp - INTERVAL '1 hour');"

pkuehnel avatar Nov 23 '21 17:11 pkuehnel

@pkuehnel I am glad that you found a workaround. If you experience this issue again or have more information, please reopen it.

erimatnor avatar Nov 24 '21 07:11 erimatnor

PG 13.4 TDB: 2.7.0 Also docker on debian.

I have the same problem. After updating from PG 13.2 and TDB 2.5.0 my aggregate policies are no longer triggering. Deleting and recreating them does not help. They do execute once when I create the policy, but never again after that.

Splitframe avatar Aug 23 '22 10:08 Splitframe

Hi @Splitframe, thanks for reaching out to us. I tried the following to reproduce the issue you're facing, unfortunately without success:

  1. Started a container with pg12 and timescaledb 2.5.0: docker run -d --name timescaledb -p 5436:5432 -e POSTGRES_PASSWORD=password timescale/timescaledb:2.5.0-pg12
  2. connected to it with psql -Upostgres -hlocalhost -p5436 postgres
  3. created the CAGG and refresh policy given above - with a schedule_interval of 30 seconds instead of 1 minute.
  4. updated to 2.7.0 following the instructions provided in our documentation
  5. continuous aggregate refresh policy continued to run as expected:
2022-08-30 10:20:50.538 UTC [1] LOG:  starting PostgreSQL 12.11 on aarch64-unknown-linux-musl, compiled by gcc (Alpine 10.3.1_git20211027) 10.3.1 20211027, 64-bit

2022-08-30 10:20:50.538 UTC [1] LOG:  listening on IPv4 address "0.0.0.0", port 5432

2022-08-30 10:20:50.538 UTC [1] LOG:  listening on IPv6 address "::", port 5432

2022-08-30 10:20:50.541 UTC [1] LOG:  listening on Unix socket "/var/run/postgresql/.s.PGSQL.5432"

2022-08-30 10:20:50.577 UTC [22] LOG:  database system was shut down at 2022-08-30 10:20:37 UTC

2022-08-30 10:20:50.581 UTC [1] LOG:  database system is ready to accept connections

2022-08-30 10:20:50.583 UTC [28] LOG:  TimescaleDB background worker launcher connected to shared catalogs

2022-08-30 10:20:50.601 UTC [31] LOG:  refreshing continuous aggregate "conditions_summary_hourly" in window [ 2022-07-30 11:00:00, 2022-08-30 09:00:00 ]

2022-08-30 10:21:20.621 UTC [37] LOG:  refreshing continuous aggregate "conditions_summary_hourly" in window [ 2022-07-30 11:00:00, 2022-08-30 09:00:00 ]

2022-08-30 10:21:50.636 UTC [44] LOG:  refreshing continuous aggregate "conditions_summary_hourly" in window [ 2022-07-30 11:00:00, 2022-08-30 09:00:00 ]

Might it be possible to provide the specific steps you followed, as well as your log output? That could help us debug this

konskov avatar Aug 30 '22 10:08 konskov

Hello @konskov,

thank you for your reply. I temporarily managed a workaround with cronjobs and manual refreshes. As soon as I come around to check up on this again I will posts a report and the logs.

Splitframe avatar Aug 30 '22 12:08 Splitframe

Hello, we are facing similar issue: PG: 11.15 TSDB: 2.3.0 OS: Linux Version: Red Hat Enterprise Linux Server 7.9 Database was created with TSDB version 1.x and upgraded. Continous aggr. was created before upgrade. We have sensor data tables where we have timestamp and sensor values, continuous aggr. was created using script like that:

CREATE MATERIALIZED VIEW sensor_data_3h_agg WITH ( timescaledb.continuous, timescaledb.materialized_only=false ) AS SELECT time_bucket('3H', sensor_ts) as sensor_ts, avg(sensor_value) as avg_sensor_value, max(sensor_value) as max_sensor_value, min(sensor_value) as min_sensor_value, first(sensor_value, sensor_ts) as first_sensor_value FROM sensor_data GROUP BY time_bucket('3H', sensor_ts) WITH NO DATA;

SELECT add_continuous_aggregate_policy('sensor_data', start_offset => NULL, end_offset => INTERVAL '6 h', schedule_interval => INTERVAL '1 h');

Background job is failing.

DB log: LOG: job 1101 threw an error ERROR: cannot execute SQL without an outer snapshot or portal CONTEXT: SQL statement "DELETE FROM _timescaledb_internal._materialized_hypertable_224 AS D WHERE D.sensor_ts >= '2022-08-28 03:00:00' AND D.sensor_ts < '2022-09-26 18:00:00' ;" LOG: background worker "Refresh Continuous Aggregate Policy [1101]" (PID 25087) exited with exit code 1

majozv avatar Sep 29 '22 13:09 majozv

@majozv You are using a very old version that is no longer supported. That specific error appearing in your logs should not happen on recent postgres and timescaledb versions.

svenklemm avatar Oct 01 '22 18:10 svenklemm

@svenklemm Hi, thanks a lot. We are going to upgrade postgresql and timescaledb version on our environment.

majozv avatar Oct 03 '22 08:10 majozv

Dear Author,

This issue has been automatically marked as stale due to lack of activity. With only the issue description that is currently provided, we do not have enough information to take action. If you have or find the answers we would need, please reach out. Otherwise, this issue will be closed in 30 days. Thank you!

github-actions[bot] avatar Dec 03 '22 01:12 github-actions[bot]

Dear Author,

We are closing this issue due to lack of activity. Feel free to add a comment to this issue if you can provide more information and we will re-open it. Thank you!

github-actions[bot] avatar Jan 02 '23 01:01 github-actions[bot]