pgstac
pgstac copied to clipboard
Setting up time partitioning after initial data load
The documentation states that Partitioning needs to be set up before data is loaded. I'm curious if it's possible to set up time partitioning after the initial data load. I've noticed that PostgreSQL provides the ATTACH PARTITION sub-command, which has led me to consider the following steps:
- Generate tables (partitions) for a specific collection and transfer the corresponding records from
_items_1. - Remove items from the
_items_1partition. - Utilize the
ATTACHcommand to link the partitions created in the first step toitems. - Define
partition_truncfor the collection.
Is there something crucial that I might be overlooking?
@drnextgis it looks like that section of the documentation needs to be updated. There are now triggers on the collections table that will rewrite a full collection into new partitions if the partition strategy is changed.
https://github.com/stac-utils/pgstac/blob/634122f7736da84c4168c2512498766f87c96ad2/src/pgstac/sql/003b_partitions.sql#L455
I will note that on a large collection, this could take a very long time and take up some aggressive locks! So as much as possible, you should still try to set the partition strategy before loading any data.