gpdb
gpdb copied to clipboard
GPDB7 REGRESSION: pg_partition_columns "comparable view" in documentation is not comparable
Bug Report
In the Greenplum 7 documentation there is a section that attempts to provide a comparable view for the loss of the previous pg_partition_columns. The view however is not comparable in one critical aspect.
The pg_partition_columns keyed the output on the parent table and provided the column name (assume just 1 per partition level for simplicity) that the partition is using at that level. The "comparable view" provided in the documentation isn't doing that and instead, it appears, is keying off some other partition level for the table name. I'll provide an example and show the difference:
CREATE TABLE RP_TESTS.RP_DATE_DEFAULT_LIST_SUBP ( RP_INT INTEGER, RP_DATE DATE, RP_TIMESTAMP TIMESTAMP, RP_DIST_KEY INTEGER NOT NULL, RP_SUBPART CHAR(1) ) WITH ( APPENDONLY=TRUE ,COMPRESSTYPE=ZLIB ,COMPRESSLEVEL=9 ,ORIENTATION=COLUMN ) DISTRIBUTED BY (RP_DIST_KEY) PARTITION BY RANGE(RP_DATE) SUBPARTITION BY LIST(RP_SUBPART) SUBPARTITION TEMPLATE ( SUBPARTITION A VALUES ('A'), SUBPARTITION B VALUES ('B'), SUBPARTITION C VALUES ('C'), DEFAULT SUBPARTITION default_sub_part ) ( START (DATE '2015-06-01') INCLUSIVE END (DATE '2016-01-01') EXCLUSIVE EVERY (INTERVAL '1 DAY'), DEFAULT PARTITION default_part );
So, using this example. This is what pg_partition_columns give me in releases prior to Greenplum 7:
gpadmin=# select * from pg_partition_columns where tablename = 'rp_date_default_list_subp'; schemaname | tablename | columnname | partitionlevel | position_in_partition_key ------------+---------------------------+------------+----------------+--------------------------- rp_tests | rp_date_default_list_subp | rp_date | 0 | 1 rp_tests | rp_date_default_list_subp | rp_subpart | 1 | 1 rp_tests | rp_date_default_list_subp | rp_subpart | 1 | 1
so I can easily tell from this view that the rp_date_default_list_subp table has 2 partitionlevels and the first run is using column rp_date and the second one is using column rp_subpart. This is very clean.
Doing same query using provided "comparable view" from documentation:
SELECT c.relnamespace::regnamespace AS schemaname, c.relname AS tablename, att.attname AS columnname, ( SELECT level FROM pg_partition_tree(pg_partition_root(c.oid)) WHERE relid = c.oid ) AS partitionlevel, 1 AS position_in_partition_key FROM pg_partitioned_table pt LEFT JOIN pg_class c ON c.oid = pt.partrelid JOIN pg_attribute att ON c.oid = att.attrelid AND att.attnum = pt.partattrs[0] where relname = 'rp_date_default_list_subp';
schemaname | tablename | columnname | partitionlevel | position_in_partition_key ------------+---------------------------+------------+----------------+--------------------------- rp_tests | rp_date_default_list_subp | rp_date | 0 | 1 (1 row)
so this view misses the sub-partition. The reason is because the tablename in this query is picking up the partition tablename as opposed to the parent table name so the query returns a row for each partition (which is extremely messy to deal with and not at comparable to the convenience of what the previous view provided.
For example, changing the where clause to do a like condition "where relname like 'rp_date_default_list_subp%';
returns a bunch of rows and you can see the tablename is the partitionname:
schemaname | tablename | columnname | partitionlevel | position_in_partition_key ------------+----------------------------------------------+------------+----------------+--------------------------- rp_tests | rp_date_default_list_subp | rp_date | 0 | 1 rp_tests | rp_date_default_list_subp_1_prt_default_part | rp_subpart | 1 | 1 rp_tests | rp_date_default_list_subp_1_prt_2 | rp_subpart | 1 | 1 rp_tests | rp_date_default_list_subp_1_prt_3 | rp_subpart | 1 | 1 rp_tests | rp_date_default_list_subp_1_prt_4 | rp_subpart | 1 | 1 rp_tests | rp_date_default_list_subp_1_prt_5 | rp_subpart | 1 | 1
The other issue with this "comparable view" is that the partitionlevel it returns starts from 0 (which is what prior Greenplum releases did return so that's good), HOWEVER, the gp_toolkit.gp_partitions view returns partitionlevel starting with 1 which is extremely confusing since it would be common to join these 2 views.
Greenplum version or build
7.1
OS version and uname -a
ALL
autoconf options used ( config.status --config )
Installation information ( pg_config )
Expected behavior
I need what the prior pg_partition_columns provided
Actual behavior
a mess :-)
Step to reproduce the behavior
See above in description and compare Greenplum 6 to 7
As a side note, to me, as a user, I would like to see the function pg_get_partkeydef() enhanced to simply pass in the partitionlevel I want the key from (default to lowest partitionlevel for backward compatibility). That would be very clean and seems like the best path forward
@huansong Please can you help look into this and provide recommended next steps. Thanks.
Thanks for reporting.
The query was indeed not correct on https://docs.vmware.com/en/VMware-Greenplum/7/greenplum-database/install_guide-migrate-classic-partitioning.html. This is more like it:
SELECT DISTINCT ON (partitionlevel)
c.relnamespace::regnamespace AS schemaname,
pg_partition_root(c.oid)::regclass AS tablename,
att.attname AS columnname,
(
SELECT level
FROM pg_partition_tree(pg_partition_root(c.oid))
WHERE relid = c.oid
) AS partitionlevel,
1 AS position_in_partition_key
FROM
pg_partitioned_table pt
LEFT JOIN
pg_class c ON c.oid = pt.partrelid
JOIN
pg_attribute att ON c.oid = att.attrelid AND att.attnum = pt.partattrs[0];
It would give result as:
schemaname | tablename | columnname | partitionlevel | position_in_partition_key
------------+---------------------------+------------+----------------+---------------------------
public | rp_date_default_list_subp | rp_date | 0 | 1
public | rp_date_default_list_subp | rp_subpart | 1 | 1
(2 rows)
I'll work with Docs team to get it corrected.
Regarding the levels: we made the conscious choice to +1 level for gp_partitions in order to not confuse JOIN with the upstream view/functions like pg_partition_tree. But agreed we should align the query in the docs with gp_partitions too. I'll get that corrected as well.
Thank you, this is very helpful but it didn't work in my environment where I have multiple partitioned tables. Here is what I get when I run the query, see what I changed in it to, I think, get it working:
gpadmin=# SELECT DISTINCT ON (partitionlevel) c.relnamespace::regnamespace AS schemaname, pg_partition_root(c.oid)::regclass AS tablename, att.attname AS columnname, ( SELECT level FROM pg_partition_tree(pg_partition_root(c.oid)) WHERE relid = c.oid ) AS partitionlevel, 1 AS position_in_partition_key FROM pg_partitioned_table pt LEFT JOIN pg_class c ON c.oid = pt.partrelid JOIN pg_attribute att ON c.oid = att.attrelid AND att.attnum = pt.partattrs[0];
I think the problem though is just with the DISTINCT ON. I changed the query to this and now it seems to work:
SELECT DISTINCT ON (partitionlevel, c.relnamespace::regnamespace, pg_partition_root(c.oid)::regclass) c.relnamespace::regnamespace AS schemaname, pg_partition_root(c.oid)::regclass AS tablename, att.attname AS columnname, ( SELECT level FROM pg_partition_tree(pg_partition_root(c.oid)) WHERE relid = c.oid ) AS partitionlevel, 1 AS position_in_partition_key FROM pg_partitioned_table pt LEFT JOIN pg_class c ON c.oid = pt.partrelid JOIN pg_attribute att ON c.oid = att.attrelid AND att.attnum = pt.partattrs[0];
I think the problem though is just with the DISTINCT ON. I changed the query to this and now it seems to work: SELECT DISTINCT ON (partitionlevel, c.relnamespace::regnamespace, pg_partition_root(c.oid)::regclass)
Yes agreed. Distinct on only the level is not enough. Thanks for providing the feedbacks!
I would like to see the function pg_get_partkeydef() enhanced to simply pass in the partitionlevel
I missed this in earlier reply. It is again one of the things that's tricky with heterogenous partition structures: what to return if there are different partition keys at the same level? Perhaps we could return multiple lines, each for a different partition key. But the performance would be bad - even for homogeneous structures we need to check through all partitions at that level. Another thing is that we cannot modify pg_get_partkeydef() in 7X as that'll be a catalog change. So instead of modifying that function we can think about what new views/UDFs that we can add to gp_toolkit (alongside existing ones like gp_partitions). For example, a function like pg_get_first_child_partkey(root_oid, level) which returns the same info as you described but only for the first child, with the user knowing that they are dealing with a homogeneous structure so the first child is enough. Does that sound good?