volkszaehler.org
volkszaehler.org copied to clipboard
vzcompress uses unknown id field for data table
I started with a clean install of volkszaehler earlier this year and migrated the data from my old server using dbcopy.
Since then I did not get the vzcompress2 script to work. Here is the output
12/02/2021 04:51:45 PM - SQL:SELECT entities.*, properties.value AS name FROM entities INNER JOIN properties ON properties.entity_id = entities.id WHERE properties.pkey = ? AND class = ?
12/02/2021 04:51:45 PM - Processing channel 13df0000-097b-11eb-a092-174646e1882d (Bathroom)...
12/02/2021 04:51:45 PM - SQL:SELECT MIN(timestamp) AS min, MAX(timestamp) AS max FROM data WHERE channel_id = ? AND timestamp <= ? AND timestamp > ?
12/02/2021 04:51:45 PM - Skipping compression pass for data points between 11/02/2021 04:51:45 PM and 11/25/2021 04:51:45 PM using a 60 seconds window: No data points found
12/02/2021 04:51:45 PM - SQL:SELECT MIN(timestamp) AS min, MAX(timestamp) AS max FROM data WHERE channel_id = ? AND timestamp <= ? AND timestamp > ?
12/02/2021 04:51:45 PM - Processing: 06/05/2021 05:56:28 PM - 06/05/2021 06:01:28 PM (0%)... 08:46:07 PM using a 300 seconds window
12/02/2021 04:51:45 PM - SQL:SELECT AVG(value) AS newval, COUNT(value) AS datapoints, MAX(id) AS updateid FROM data WHERE channel_id = ? AND timestamp > ? AND timestamp <= ?PHP Fatal error: Uncaught PDOException: SQLSTATE[42S22]: Column not found: 1054 Unknown column 'id' in 'field list' in /home/scripts/volkszaehler.org/vendor/doctrine/dbal/lib/Doctrine/DBAL/Driver/PDOStatement.php:112
I have added an additional debug output to see the actual SQL. And that leaves me puzzled: data table does not have an id field. The structure is
channel_id PrimaryIndex int(11) No None
timestamp Primary bigint(20) No None
value double No None
Which seems to be the intended structure. When executing
bin/doctrine orm:schema-tool:create --dump-sql > database.sql This is the create for data:
CREATE TABLE data (timestamp BIGINT NOT NULL, channel_id INT NOT NULL, value DOUBLE PRECISION NOT NULL, INDEX IDX_ADF3F36372F5A1AA (channel_id), PRIMARY KEY(channel_id, timestamp)) DEFAULT CHARACTER SET utf8 COLLATE
utf8_unicode_ci ENGINE = InnoDB;
Looking at an sql dump of my previous server it did seem to have an id field before.
Am I missing something? Was the table structure changed but vzcompress was not updated?
This looks like a bug. id
was removed when we optimized the schema for the >1 billion records in the demo installation. vzcompress was not updated then. PR welcome.
The problem starts in https://github.com/volkszaehler/volkszaehler.org/blob/7225eac0a6582b4b8ae25a60d5cc506310fabddb/bin/vzcompress2#L260. The id
is used to delete all rows in the timespan expect for one and then updates that one. Instead we need to delete all records and create a new one with the aggregated data.
/cc @jahir
would it be safe to use the exact timestamp instead of the id?
So to delete the "other" entries do
"FROM data WHERE channel_id = ? AND timestamp > ? AND timestamp < ?",
instead of
"FROM data WHERE channel_id = ? AND timestamp > ? AND timestamp <= ?"
And do the update to the exact timestamp?
or is this unsafe because there can be more than one datapoint for a given timestamp?
PRIMARY KEY(channel_id, timestamp)
, so using these two would be safe.
Remember the next statement then needs be an INSERT, not an UPDATE
So the solution is : replace id with timestamp
- "MAX(id) AS updateid" with "MAX(timestamp) AS updateid"
- "DELETE ... and id != ?," with "DELETE ... and timestamp != ?,"
- "UPDATE data ... AND id = ?" with "UPDATE data ... AND timestamp = ?" right?