
Create a migration script for when rollup values change

Open jeffpierce opened this issue 9 years ago • 2 comments

Right now, we make the choice that if a rollup value changes for a path, we're basically orphaning data for the windows that changed.

Let's see if we can come up with a way to do a migration based on the window changes and existing databases (or old windows).

jeffpierce avatar Oct 23 '15 18:10 jeffpierce

For discussion purposes, here are our current definitions:

rollups:
    default:
        retention:
            - 6s:6h
            - 1m:7d
            - 1h:30d
            - 6h:365d
        aggregation: average

The implications of various modifications to this definition are as follows:

Changing the retention period on one rollup: 6s:6h -> 6s:8h

The retention period determines the name of the table in which the data is stored, so this change makes all previously written data inaccessible. Recovery: Copy the data to the new table, and delete it from the old.

Changing the rollup window on one rollup: 6s:6h -> 10s:6h

New data is written to the same table as before, but at different time intervals. Old data can be perceived as partial data for the new window definition, and how well this works depends on the relationship between the old and the new window size. Going from 6 seconds to 12 seconds would produce flawlessly combined data; other changes could show erratic or wildly wrong data. We do not store enough information to recover from this.
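To make the window relationship concrete, here is a minimal Python sketch. It is not Cassabon code: `parse_duration`, `combines_cleanly`, and `rebucket_average` are hypothetical names, and the `s`/`m`/`h`/`d` suffixes are assumed from the definitions above. It checks whether the new window is a whole multiple of the old one and, if so, merges old points into new buckets by averaging. It assumes the old points are aligned to the old window and that each carries equal weight, which is an approximation since per-point sample counts are not stored.

```python
import re

# Seconds per unit suffix; assumed from the retention strings in this issue.
_UNITS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def parse_duration(text):
    """Convert a duration such as '6s' or '1h' to seconds."""
    match = re.fullmatch(r"(\d+)([smhd])", text)
    if match is None:
        raise ValueError(f"bad duration: {text!r}")
    return int(match.group(1)) * _UNITS[match.group(2)]


def combines_cleanly(old_window, new_window):
    """True when the new window is a whole multiple of the old one."""
    return parse_duration(new_window) % parse_duration(old_window) == 0


def rebucket_average(points, new_window):
    """Merge (timestamp, value) points into new_window-sized buckets by averaging."""
    size = parse_duration(new_window)
    buckets = {}
    for ts, value in points:
        # Floor each timestamp to the start of its new bucket.
        buckets.setdefault(ts - ts % size, []).append(value)
    return [(start, sum(vals) / len(vals)) for start, vals in sorted(buckets.items())]
```

With these definitions, `combines_cleanly("6s", "12s")` holds while `combines_cleanly("6s", "10s")` does not, which matches the distinction drawn above between clean and erratic window changes.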

Adding a new window:retention

This simply causes new data to be captured, with no migration implications.

Deleting a window:retention

This renders the data inaccessible, but it is still present and occupying space. It should probably be deleted.
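As a starting point for a migration script, the four cases above could be detected by diffing the old and new retention lists. The following Python sketch is only an illustration under stated assumptions: `parse_retention` and `classify_changes` are hypothetical names, entries are matched greedily (same window first, then same retention), and nothing here touches the database or knows how Cassabon actually names its tables.

```python
def parse_retention(entries):
    """Turn ['6s:6h', ...] into a list of (window, retention) tuples."""
    return [tuple(entry.split(":")) for entry in entries]


def classify_changes(old_entries, new_entries):
    """Label each difference between two retention lists.

    Returns (kind, old_pair, new_pair) tuples, where kind is one of
    'retention_changed', 'window_changed', 'deleted', or 'added'.
    """
    old = parse_retention(old_entries)
    new = parse_retention(new_entries)
    old_left = [pair for pair in old if pair not in new]
    new_left = [pair for pair in new if pair not in old]

    changes = []
    for window, retention in old_left:
        same_window = [p for p in new_left if p[0] == window]
        same_retention = [p for p in new_left if p[1] == retention]
        if same_window:
            # Same window, new retention: data would need copying to the new table.
            target = same_window[0]
            changes.append(("retention_changed", (window, retention), target))
            new_left.remove(target)
        elif same_retention:
            # Same table, new window: old data is not recoverable in general.
            target = same_retention[0]
            changes.append(("window_changed", (window, retention), target))
            new_left.remove(target)
        else:
            changes.append(("deleted", (window, retention), None))
    # Anything still unmatched in the new list is a plain addition.
    changes.extend(("added", None, pair) for pair in new_left)
    return changes
```

A script built on this would copy data for `retention_changed`, warn for `window_changed`, and offer cleanup for `deleted`. Note that a change to both window and retention at once is indistinguishable from a delete plus an add.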

Notes

  • Configuring two rollup windows with the same retention period will cause data to be scrambled, as there is no way of distinguishing the two data sets for the same path within the one table. This should probably be prevented during configuration load.
  • Deleting the data for a rollup involves inspecting every path present in the database to see if it matches the expression that defines the rollup. This must of course be done using the same algorithm used to select it when it was originally rolled up.
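The first note suggests rejecting such configurations at load time. A minimal Python sketch of that check follows; `find_duplicate_retentions` and `validate_retentions` are hypothetical names, not part of Cassabon. It compares retention strings literally, so it assumes that equivalent spellings such as `24h` and `1d` either cannot occur or are normalized beforehand.

```python
from collections import defaultdict


def find_duplicate_retentions(entries):
    """Return {retention: [windows]} for retention periods used more than once."""
    windows_by_retention = defaultdict(list)
    for entry in entries:
        window, retention = entry.split(":")
        windows_by_retention[retention].append(window)
    return {r: w for r, w in windows_by_retention.items() if len(w) > 1}


def validate_retentions(entries):
    """Raise ValueError if two windows would share one retention period."""
    duplicates = find_duplicate_retentions(entries)
    if duplicates:
        raise ValueError(f"retention periods shared by several windows: {duplicates}")
```

Calling `validate_retentions` once per rollup definition during configuration load would turn the scrambled-data case into an explicit startup error.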

mredivo avatar Oct 23 '15 21:10 mredivo

Yeah, any attempt at this would be best-effort, so if we don't have the data to rebuild a new table, we just don't have it.

As far as deleting the data goes, it shouldn't be a ton of work to match it against the stat index, especially once we implement matching other than wildcards.

jeffpierce avatar Oct 23 '15 21:10 jeffpierce