gpdb icon indicating copy to clipboard operation
gpdb copied to clipboard

[6X] Invalid rule definition for signed values

Open hughcapet opened this issue 3 years ago • 6 comments
trafficstars

This PR is a backport of changes made in the master branch while merging 9.5. This fixes the problem of incorrect printing of signed constants. This is in particular very important for gpbackup, as it uses pg_get_partition_def func for metadata dumping:

CREATE TABLE test (id int) 
DISTRIBUTED BY (id)
PARTITION BY LIST (id)
( PARTITION negative VALUES (-1, 1), 
  PARTITION zero VALUES (0), 
  DEFAULT PARTITION other );

-- The query gpbackup uses for metadata gathering

SELECT p.parrelid AS oid,
pg_get_partition_def(p.parrelid, true, true) AS definition,
pg_get_partition_template_def(p.parrelid, true, true) AS template
FROM pg_partition p
JOIN pg_class c ON p.parrelid = c.oid
JOIN pg_namespace n ON c.relnamespace = n.oid
WHERE c.oid = 'test'::regclass;
  oid  |                                                definition                                                 | template 
-------+-----------------------------------------------------------------------------------------------------------+----------
 93355 | PARTITION BY LIST(id)                                                                                    +| 
       |           (                                                                                              +| 
       |           PARTITION negative VALUES(1, (-1)) WITH (tablename='test_1_prt_negative', appendonly='false'), +| 
       |           PARTITION zero VALUES(0) WITH (tablename='test_1_prt_zero', appendonly='false'),               +| 
       |           DEFAULT PARTITION other  WITH (tablename='test_1_prt_other', appendonly='false')               +| 
       |           )                                                                                               | 
(1 row)

-- Try to use the printed definition

create table test (id int)
 PARTITION BY LIST(id)                                                                                   
           (                                                                                             
           PARTITION negative VALUES(1, (-1)) WITH (tablename='test_1_prt_negative', appendonly='false'),
           PARTITION zero VALUES(0) WITH (tablename='test_1_prt_zero', appendonly='false'),              
           DEFAULT PARTITION other  WITH (tablename='test_1_prt_other', appendonly='false')              
           );
ERROR:  syntax error at or near "("
LINE 4:            PARTITION negative VALUES(1, (-1)) WITH (tablenam...

Consequently, gpbackup fails while this table is being restored.

Behavior after the fix:

SELECT p.parrelid AS oid,
pg_get_partition_def(p.parrelid, true, true) AS definition,
pg_get_partition_template_def(p.parrelid, true, true) AS template
FROM pg_partition p
JOIN pg_class c ON p.parrelid = c.oid
JOIN pg_namespace n ON c.relnamespace = n.oid
WHERE c.oid = 'test'::regclass;
  oid  |                                                definition                                                 | template 
-------+-----------------------------------------------------------------------------------------------------------+----------
 93355 | PARTITION BY LIST(id)                                                                                    +| 
       |           (                                                                                              +| 
       |           PARTITION negative VALUES(1, '-1') WITH (tablename='test_1_prt_negative', appendonly='false'), +| 
       |           PARTITION zero VALUES(0) WITH (tablename='test_1_prt_zero', appendonly='false'),               +| 
       |           DEFAULT PARTITION other  WITH (tablename='test_1_prt_other', appendonly='false')               +| 
       |           )                                                                                               | 
(1 row)

The second commit unifies partition rule definition for all type of rules: start, end, every. PartEvery bound specification is now coerced unconditionally (regardless the rtypeId and newrtypeId comparison result).The aim is only to make the printed rules be consistent (i.e. avoid FuncExpr in partEvery, while partStart and partEnd were coerced to const). As an example, output of pg_get_partition_def function before the fix

postgres=# select pg_get_partition_def('test_table'::regclass, true);
                                         pg_get_partition_def                                         
------------------------------------------------------------------------------------------------------
 PARTITION BY RANGE(b)                                                                               +
           (                                                                                         +
           START ('1'::double precision) END ('100'::double precision) EVERY (20::double precision), +
           PARTITION p1 START ('100'::double precision) END ('150'::double precision)                +
           )
(1 row)

and after the fix

------------------------------------------------------------------------------------------------------
 PARTITION BY RANGE(b)                                                                               +
           (                                                                                         +
           START ('1'::double precision) END ('100'::double precision) EVERY ('20'::double precision), +
           PARTITION p1 START ('100'::double precision) END ('150'::double precision)                +
           )
(1 row)

hughcapet avatar Mar 30 '22 04:03 hughcapet

Could someone please post the failed tests' output in concourse-ci?

hughcapet avatar Mar 30 '22 10:03 hughcapet

Could someone please post the failed tests' output in concourse-ci?

It failed at gporca, which will output the detail of type conversion:

======================================================================
DIFF FILE: ../gpdb_src/src/test/regress/regression.diffs
----------------------------------------------------------------------

--- \/tmp\/build\/b235ddbb\/gpdb_src\/src\/test\/regress\/expected\/gporca_optimizer\.out	2022-03-30 05:22:30.596874178 +0000
+++ \/tmp\/build\/b235ddbb\/gpdb_src\/src\/test\/regress\/results\/gporca\.out	2022-03-30 05:22:31.312932819 +0000
@@ -13775,15 +13809,15 @@
 SELECT DISTINCT L1.c, L1.lid
 FROM t55 L1 CROSS JOIN META
 WHERE L1.lid = META.LOAD_ID;
-                                                        QUERY PLAN                                                         
----------------------------------------------------------------------------------------------------------------------------
+                                                             QUERY PLAN                                                             
+------------------------------------------------------------------------------------------------------------------------------------
                                                                      Output: true
                                                                ->  Result  (cost=0.00..0.00 rows=1 width=1)
                                                                Output: '2020-01-01', '99'
                                                          ->  Result  (cost=0.00..0.00 rows=1 width=8)
                                                          Output: load_id
                                                    ->  Result  (cost=0.00..0.00 rows=1 width=8)
-                                                   Output: load_id, pg_catalog.int4in(unknownout(load_id), ''::void, (-1))
+                                                   Output: load_id, pg_catalog.int4in(unknownout(load_id), ''::void, '-1'::integer)
                                              ->  Result  (cost=0.00..0.00 rows=1 width=8)
                                              Output: c, lid
                                              Output: load_id
@@ -13794,7 +13828,7 @@
                                        Output: load_id
                                  ->  Hash  (cost=0.00..0.00 rows=1 width=8)
                                  ->  Redistribute Motion 3:3  (slice1; segments: 3)  (cost=0.00..431.02 rows=334 width=8)
-                                 Hash Cond: (t55.lid = pg_catalog.int4in(unknownout("outer".load_id), ''::void, (-1)))
+                                 Hash Cond: (t55.lid = pg_catalog.int4in(unknownout("outer".load_id), ''::void, '-1'::integer))
                                  Output: c, lid
                            ->  Hash Join  (cost=0.00..431.08 rows=1 width=8)
                            Output: c, lid

======================================================================

SmartKeyerror avatar Mar 30 '22 13:03 SmartKeyerror

So sorry for long time no review.

@hughcapet Thanks a lot for your contribution.

Rought reading the code, seems fine. Will spend more time on this later.

kainwen avatar Aug 10 '22 14:08 kainwen

Also I'm surprised do we not have upgrade test coverage for this? @greenplum-db/upgrade

l-wang avatar Aug 16 '22 21:08 l-wang

Could you explain why do we need the second commit?

@l-wang, thank you for review. We faced inconsistency in the partition rule definition generation between every and start/stop clauses during review test changes after backporting the Postgres commit. As we found out, this inconsistency is caused by a difference in handling every clause in the preprocess_range_spec function. At the first glance, there are no reasons to defer type coercion from type casting function call to constant to later steps, because Every clause should contain only constant values. At the second hand, the signed values aren't applicable for the Every clause. So we suggest discussing unification for handling and storing as a part of this patch.

Stolb27 avatar Aug 17 '22 00:08 Stolb27

@jimmyyih please can you help to review and drive this PR to completion?

ashwinstar avatar Sep 08 '22 21:09 ashwinstar