pg_partman
Is there a way to direct dropped partitions into another (partitioned) archive table?
This might be a case of an 'XY problem' since I've only begun learning about partitioning over the past week, but I have a time-history table for an IoT-type workload I'm trying to wrangle (~500k records are appended per day, each containing an id, a bigint timestamp as epoch ms, and values). It's also worth noting that I'm using Postgres 11 and am stuck at version 4.0.0 of pg_partman (Azure).
The one requirement I have is that the last 12+1 months of records need to be kept available, but older data still needs to be stored.
I thought it might be best to create a history table with 13 monthly partitions and an archive table that is exactly the same but with no retention limit set. The motivation is that I'm wary of creating too many partitions and, if a query doesn't take advantage of constraints, it seems best to limit it to a subset of the data. I wondered if there might then be a way to automatically move and append old partitions to the archive table, although I'm unsure of this plan since ~15,000,000 rows per partition might be pushing it. I'm out of my depth, if you can tell.
My question is: does pg_partman support automatically moving partitions from one table to another? I can't find anything in the documentation about a function or policy that will move partitions past the retention period into another table. I looked into setting retention_schema, but it migrates each partition as a stand-alone table. I then found partition_data_(time/proc), but I think that moves the whole table.
If not, I guess I could really use some advice/wisdom for my actual problem - this big table being big.
(Also, should I apply a timestamp constraint to the parent? It's currently got one on to_timestamp(timestamp/1000), but I'd like to be able to do range queries with epoch time)
Thanks for any help/advice, and thank you for the awesome work and help you provide, keith!
You found the retention_schema option, which is the closest thing I currently have to supporting that. partition_data_* is primarily for moving data when initially setting up a partition set from an existing table or for moving it out of the default table.
You could have another process that re-attaches the table to a different partitioned table, but pg_partman does not currently support that. Honestly, you may just want to leave them as separate tables unless you find you really need to access old data frequently. The naming patterns for pg_partman are predictable, so you could always write queries to target the specific child tables that have the old data you're looking for.
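For reference, the manual detach/re-attach step I'm talking about would look roughly like this (placeholder table names and bounds, using epoch ms to match your partition key):

-- Detach the old child from the active partition set...
ALTER TABLE my_schema.history DETACH PARTITION my_schema.history_p2020_01;
-- ...then attach it to the archive partitioned table with the same bounds (Jan 2020 in epoch ms).
ALTER TABLE my_schema.history_archive
    ATTACH PARTITION my_schema.history_p2020_01
    FOR VALUES FROM (1577836800000) TO (1580515200000);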
I don't see a need for a constraint on the parent table. The entire partition set inherently has the constraint of the oldest and newest data, and anything outside those boundaries will go into the default, or error out if you opt not to have a default. I'd recommend just monitoring for data going into the default, if you have one, and handling it ASAP from there.
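A simple check to put on a schedule would be something like the following (placeholder name; adjust to however your default child table is actually named):

-- Any non-zero count means rows have landed outside the defined partition boundaries.
SELECT count(*) FROM my_schema.history_default;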
@keithf4 Awesome, thanks for the info! I was thinking that keeping only the last year of data in an active table and moving stuff older than that to another would be "more performant", but that was born out of my misunderstandings I think. I wasn't aware of check constraints until a week ago.
I have also written a few functions to work with the partitions since asking this question, mostly for clustering a parent table's partitions (getting all child tables, getting their pkey, and running an EXECUTE 'ALTER TABLE [table] CLUSTER ON [pkey]' statement). The logic could also be extended to include other maintenance commands, such as REINDEX. I think my implementation could be improved (and maybe made to recurse into sub-partitions if I spend more time on it), but is such functionality in the scope of pg_partman? It's basically the native counterpart to apply_cluster, but I don't think I'm the first to write such a routine. If it is in scope, I'd be happy to share it.
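Roughly, the core of it looks something like this (a sketch with a placeholder parent table name, not my exact function):

-- Set CLUSTER on each child table's primary key index.
DO $$
DECLARE
    v_child regclass;
    v_pkey  text;
BEGIN
    FOR v_child IN
        SELECT inhrelid::regclass
        FROM pg_inherits
        WHERE inhparent = 'my_schema.my_parent_table'::regclass
    LOOP
        -- Look up the child's primary key index name
        SELECT c.relname INTO v_pkey
        FROM pg_index i
        JOIN pg_class c ON c.oid = i.indexrelid
        WHERE i.indrelid = v_child
          AND i.indisprimary;

        IF v_pkey IS NOT NULL THEN
            EXECUTE format('ALTER TABLE %s CLUSTER ON %I', v_child, v_pkey);
        END IF;
    END LOOP;
END
$$;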
As of PG12+, it's actually using partition pruning, which is more performant than plain constraint exclusion. I'm not sure of the extent of the overhead for partition pruning; I believe there is still some there vs. a standalone table. I would honestly test and see how things are performing, and if it's working ok, then you don't need to worry about moving the tables out of the partition set until you're actually ready to expire the data.
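An easy way to check is to EXPLAIN a typical range query and confirm that only the relevant children show up in the plan (placeholder names/values; the column here is your bigint epoch-ms timestamp):

EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM my_schema.history
WHERE "timestamp" >= 1609459200000   -- 2021-01-01 in epoch ms
  AND "timestamp" <  1612137600000;  -- 2021-02-01 in epoch ms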
Some functions to reindex/cluster a partition set could definitely be useful. Feel free to create a PR. Thank you!
Hey @keithf4 I've been thinking about adding the clustering functionality, and I realized that what I'd written is good for my use-case, but probably not in the general case. I've been thinking the past few days about how best to handle clustering without assuming the user wants to use the parent table's primary key. Below are some ideas and snippets.
For getting a partitioned table's pkey definition, I came up with the query below (tested on PG11 and 13). It circumvents the PG11 bug where partitioned tables' indexes are absent from some of the pg_catalog tables. I think this query could be added to apply_cluster in the IF-block at line 26:
SELECT
    substring(pg_get_indexdef(idx.indexrelid) FROM ' USING .*$') AS index_def,
    t.relname AS tablename,
    i.relname AS indexname,
    t.oid,
    idx.indisprimary,
    idx.indisclustered
FROM pg_class i
JOIN pg_index idx ON idx.indexrelid = i.oid
JOIN pg_class t ON t.oid = idx.indrelid
WHERE i.relkind = 'I'
  AND t.relname = 'ifm_object_monitor_history_new';
While the parent can't have a clustered index itself, this query can be used to cluster the children on the primary key of the parent table. Although, as mentioned, the pkey isn't the best choice for every table.
I think a better solution would be to either copy or completely move clustering out of partition creation and into regular maintenance. The primary motivation for this is that it could be possible to cluster partitions on their most frequently used index.
The assumption is that partitions have had some queries run against them before they fall outside of the optimize_constraints value for their parent. With this assumption, the idx_scan column of the pg_stat_all_indexes view can be used to get the index with the maximum number of reads, either per partition or across all partitions of a parent. Then an ALTER TABLE [] CLUSTER ON [frequently_used_index] statement can be executed. I don't know how this would benefit id-based partitioning, but it would fit time-based really well.
I've so far worked out a query that gets the index w/ max scans per partition:
SELECT DISTINCT ON (child)
    child.relname AS child,
    psai.indexrelname AS index_name,
    psai.idx_scan AS scans
FROM pg_inherits
JOIN pg_class parent ON pg_inherits.inhparent = parent.oid
JOIN pg_class child ON pg_inherits.inhrelid = child.oid
JOIN pg_namespace nmsp_parent ON nmsp_parent.oid = parent.relnamespace
JOIN pg_namespace nmsp_child ON nmsp_child.oid = child.relnamespace
LEFT JOIN pg_index i ON child.oid = i.indrelid
LEFT JOIN pg_stat_all_indexes psai ON i.indexrelid = psai.indexrelid
WHERE child.reltype > 0
  AND pg_get_expr(child.relpartbound, child.oid) != 'DEFAULT'
ORDER BY child, scans DESC;
It could be filtered further to only grab non-clustered partitions as well, or to only select one table/partition at a time (I've just been too lazy to do that so far). The DISTINCT ON (child) bit can be removed to get all index scans.
Once clustered on an index, other maintenance tasks can then be run - chiefly CLUSTER and VACUUM FULL ANALYZE. The locking might be an issue, but for old partitions that are most likely not going to undergo further modification, these two commands only need to be run once, just like optimize_constraints.
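For a single child table, that one-off sequence would look something like this (placeholder table/index names):

ALTER TABLE my_schema.history_p2021_01 CLUSTER ON history_p2021_01_pkey;
CLUSTER my_schema.history_p2021_01;   -- rewrites the heap in index order (takes an ACCESS EXCLUSIVE lock)
ANALYZE my_schema.history_p2021_01;   -- since CLUSTER already rewrites the table, a plain ANALYZE may be enough here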
As for REINDEX, it's unofficially official that it's redundant after a CLUSTER operation because the table and its indexes are rebuilt from scratch (src).
I originally wanted to implement all this as functions for my personal use-case, but every time I hit upon something I've thought "dang, I wish I could tinker with partman". However, if any of this is outside scope, let me know and I’ll try to add just what makes sense (might be a while since I can’t build partman from source). Any thoughts/ideas are also welcome since I’ve been chasing a lot of my own the past week and now feel I don’t have as much clarity as I did a week ago.
Could you point me to where the optimize_constraints task is run in partman, though? I'm having trouble finding it.
Thanks for making it through this wall of text ❤️
Apologies for the delay in response. Hadn't looked back at pg_partman issues for a while and catching up now.
The constraint optimization is called in the create_partition_*() functions.
https://github.com/pgpartman/pg_partman/blob/master/sql/functions/create_partition_id.sql#L327 https://github.com/pgpartman/pg_partman/blob/master/sql/functions/create_partition_time.sql#L432
The apply_cluster() function is also called there, but it is only called on non-natively partitioned tables and relies on looking up the clustering on the current parent table to determine things.
https://github.com/pgpartman/pg_partman/blob/master/sql/functions/create_partition_time.sql#L329
If you'd want to add the functionality to cluster similar to how the optimize_constraints works, that's fine. To keep it flexible how you are thinking, I'd probably just not bother trying to figure out anything automatically and let the user define the column(s) they want to cluster on.
Thinking more on this, though, I'm not sure how you'd determine that the cluster may or may not need to run anymore. Ideally it would only be a one-off operation on older tables, but if the user is actively still using the older tables and still desires clustering, then how does one figure that out?
I think clustering may be better moved out as an operation independent of normal maintenance, with options to the function/procedure that let the user more clearly define which children to actually run the operation on. It could have an option for age (similar to optimize_constraint) but also other options to more narrowly define which child tables to run it on.
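Purely to illustrate the kind of interface I mean (placeholder names; nothing like this exists in pg_partman yet):

-- Hypothetical signature only; body intentionally left as a stub.
CREATE PROCEDURE cluster_partition_set(
      p_parent_table text                  -- partition set to operate on
    , p_index text DEFAULT NULL            -- index to cluster on; NULL could mean "use each child's current cluster index"
    , p_older_than interval DEFAULT NULL   -- only touch children older than this (similar to optimize_constraint)
    , p_child_tables text[] DEFAULT NULL   -- or an explicit list of child tables
)
LANGUAGE plpgsql
AS $$
BEGIN
    RAISE NOTICE 'not implemented; sketch only';
END
$$;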
Will review cluster handling again as part of 5.0 since I can then ignore any trigger-based partitioning issues around that. Marking this issue for that milestone.
Thinking more on this, though, I'm not sure how you'd determine that the cluster may or may not need to run anymore. Ideally it would only be a one-off operation on older tables, but if the user is actively still using the older tables and still desires clustering, then how does one figure that out?
Oh yeah, I was only thinking about my use-case and forgot to consider non-temporal tables. I think your idea of splitting it out into a separate function with finer arguments is probably the best route.
Still, I had two thoughts on providing a way to automate some of the logic.
The first (and maybe much easier) option would be to use the pg_stats correlation column: let the user specify an absolute-value threshold that the function can compare against to decide whether it needs to cluster. Although querying it on my database returned results only for the newest 6 partitions of a table that has 39, so I'll have to look into that :/
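Something along these lines is what I have in mind (placeholder names; 0.95 is an arbitrary threshold):

-- Children whose clustering column looks "unclustered" enough to be worth re-clustering.
SELECT schemaname, tablename, attname, correlation
FROM pg_stats
WHERE schemaname = 'my_schema'
  AND tablename LIKE 'history_p%'   -- child tables of the partition set
  AND attname = 'tstamp'            -- the intended clustering column
  AND abs(correlation) < 0.95;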
The second, if the first isn't useful, would be to use table estimates:
- Add a row estimate column last_estimate to one of the partman catalogues, defaulting it to 0
- Running the cluster function first gets a current estimate of the row count for a partition
- It then compares the current estimate to the catalogued estimate
- If last_estimate/current_estimate <= 0.7 (or some user-configurable percentage threshold), then cluster it (or use some other condition)
On grabbing an estimate, Erwin Brandstetter has compiled a few queries that rely on the statistics kept by Postgres. Querying reltuples with no checks takes a few ms, and the "Safe and Explicit" query runs in about 1-1.5 seconds. I tested both on a table with ~754,886,147 rows in a PG 11 Azure instance.
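The no-checks variant is just a lookup in pg_class (placeholder table name; reltuples is only as fresh as the last VACUUM/ANALYZE on the table):

SELECT reltuples::bigint AS estimated_rows
FROM pg_class
WHERE oid = 'my_schema.history_p2021_01'::regclass;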
If these aren't helpful/interesting, that's fine - just wanted to throw some ideas out.
Thanks for your continued work on partman!
Closing this issue for now. There is an existing apply_cluster() function that can be called on a custom schedule or as needed to re-cluster all children in a given partition set. Feel free to re-open this issue if there are any further questions.