volkszaehler.org icon indicating copy to clipboard operation
volkszaehler.org copied to clipboard

vzcompress uses unknown id field for data table

Open finnjet opened this issue 3 years ago • 6 comments

I started with a clean install of volkszaehler earlier this year and migrated the data from my old server using dbcopy.

Since then I did not get the vzcompress2 script to work. Here is the output

12/02/2021 04:51:45 PM - SQL:SELECT entities.*, properties.value AS name FROM entities INNER JOIN properties ON properties.entity_id = entities.id WHERE properties.pkey = ? AND class = ?
12/02/2021 04:51:45 PM - Processing channel 13df0000-097b-11eb-a092-174646e1882d (Bathroom)...
12/02/2021 04:51:45 PM - SQL:SELECT MIN(timestamp) AS min, MAX(timestamp) AS max FROM data WHERE channel_id = ? AND timestamp <= ? AND timestamp > ?
12/02/2021 04:51:45 PM - Skipping compression pass for data points between 11/02/2021 04:51:45 PM and 11/25/2021 04:51:45 PM using a 60 seconds window: No data points found
12/02/2021 04:51:45 PM - SQL:SELECT MIN(timestamp) AS min, MAX(timestamp) AS max FROM data WHERE channel_id = ? AND timestamp <= ? AND timestamp > ?
12/02/2021 04:51:45 PM - Processing: 06/05/2021 05:56:28 PM - 06/05/2021 06:01:28 PM (0%)... 08:46:07 PM using a 300 seconds window
12/02/2021 04:51:45 PM - SQL:SELECT AVG(value) AS newval, COUNT(value) AS datapoints, MAX(id) AS updateid FROM data WHERE channel_id = ? AND timestamp > ? AND timestamp <= ?PHP Fatal error: Uncaught PDOException: SQLSTATE[42S22]: Column not found: 1054 Unknown column 'id' in 'field list' in /home/scripts/volkszaehler.org/vendor/doctrine/dbal/lib/Doctrine/DBAL/Driver/PDOStatement.php:112

I have added an additional debug output to see the actual SQL. And that leaves me puzzled: data table does not have an id field. The structure is

channel_id PrimaryIndex int(11) No None
timestamp Primary bigint(20) No None
value double No None

Which seems to be the intended structure. When executing

bin/doctrine orm:schema-tool:create --dump-sql > database.sql This is the create for data:

CREATE TABLE data (timestamp BIGINT NOT NULL, channel_id INT NOT NULL, value DOUBLE PRECISION NOT NULL, INDEX IDX_ADF3F36372F5A1AA (channel_id), PRIMARY KEY(channel_id, timestamp)) DEFAULT CHARACTER SET utf8 COLLATEutf8_unicode_ci ENGINE = InnoDB;

Looking at an sql dump of my previous server it did seem to have an id field before.

Am I missing something? Was the table structure changed but vzcompress was not updated?

finnjet avatar Dec 10 '21 08:12 finnjet

This looks like a bug. id was removed when we optimized the schema for the >1 billion records in the demo installation. vzcompress was not updated then. PR welcome.

andig avatar Dec 10 '21 08:12 andig

The problem starts in https://github.com/volkszaehler/volkszaehler.org/blob/7225eac0a6582b4b8ae25a60d5cc506310fabddb/bin/vzcompress2#L260. The id is used to delete all rows in the timespan expect for one and then updates that one. Instead we need to delete all records and create a new one with the aggregated data.

/cc @jahir

andig avatar Dec 10 '21 08:12 andig

would it be safe to use the exact timestamp instead of the id? So to delete the "other" entries do "FROM data WHERE channel_id = ? AND timestamp > ? AND timestamp < ?", instead of "FROM data WHERE channel_id = ? AND timestamp > ? AND timestamp <= ?" And do the update to the exact timestamp?

or is this unsafe because there can be more than one datapoint for a given timestamp?

finnjet avatar Dec 10 '21 08:12 finnjet

PRIMARY KEY(channel_id, timestamp), so using these two would be safe.

jahir avatar Dec 10 '21 08:12 jahir

Remember the next statement then needs be an INSERT, not an UPDATE

andig avatar Dec 10 '21 09:12 andig

So the solution is : replace id with timestamp

  1. "MAX(id) AS updateid" with "MAX(timestamp) AS updateid"
  2. "DELETE ... and id != ?," with "DELETE ... and timestamp != ?,"
  3. "UPDATE data ... AND id = ?" with "UPDATE data ... AND timestamp = ?" right?

opensprinklershop avatar Mar 15 '22 21:03 opensprinklershop