gpdb icon indicating copy to clipboard operation
gpdb copied to clipboard

AO/CO tables with dropped columns cannot be reliably COPY'd after upgrade

Open jchampio opened this issue 6 years ago • 5 comments

Greenplum version or build

master branch; likely applies to 4->6 and 5->6 upgrade cases as well

Problematic behavior

After a column is dropped from an AO/CO table and the table is run through pg_upgrade, the resulting table in the new cluster can't be COPY'd if the column encoding of the dropped column doesn't match the default (e.g. it uses zlib compression, etc.).

Steps to reproduce the behavior

make create-demo-cluster

# Load an AO/CO table with dropped columns
createdb regression
psql regression <<EOF
    CREATE TABLE test(
        a int,
        b int ENCODING (compresstype=zlib,compresslevel=1),
        c int
    ) WITH (appendonly=true, orientation=column);

    INSERT INTO test VALUES (1, 2, 3), (4, 5, 6), (7, 8, 9);
    ALTER TABLE test DROP COLUMN b;
EOF

# Run the upgrade test
createdb isolation2test
make -C contrib/pg_upgrade installcheck

This breaks during the final pg_dump call with

pg_dump: Dumping the contents of table "test" failed: PQgetResult() failed.
pg_dump: Error message from server: ERROR:  decompression information missing (cdbappendonlystorageread.c:1417) (cdbappendonlystorageread.c:1417)
pg_dump: The command was: COPY public.test (a, c) TO stdout IGNORE EXTERNAL PARTITIONS;

Why?

When pg_upgrade recreates the table's schema, it adds the dropped column and re-drops it so that pg_attribute and other pieces of the catalog understand that the data on disk has a hole where the column used to be. However, it doesn't add the column's encoding for AO/CO tables, so if the on-disk data is actually compressed, the COPY code doesn't find any decompression information and bails out.

The problem gets worse with partitioned tables. The dropped column's encoding is also missing from the partition specifications -- but pg_dump can't fix that by itself, because the SQL for creating the partitioned table comes from the old cluster directly.

(Why the COPY code needs to actually decompress a dropped column is left for another exercise, but it seems like the schema should match what's on disk in any case...?)

jchampio avatar Nov 09 '18 00:11 jchampio

(Why the COPY code needs to actually decompress a dropped column is left for another exercise, but it seems like the schema should match what's on disk in any case...?)

This is a fallout of row oriented storage. Dropping a column of a table is a lightweight operation. Simply marking the column as dropped in the catalog, without changing the table's data files. Subsequent scan/update operations on the table must, therefore, interpret dropped columns and skip over them when constructing tuples from data files. This works well for row-oriented storage. Column-oriented storage offers better opportunity for dropping a column - the entire file representing the column can be removed from disk. This opportunity is not currently leveraged because the corresponding changes needed layers above storage layer (e.g. adjusting attribute numbers after drop so that they are consecutive) are not taken care of.

@l-wang and @jmcatamney and I were looking into changing copy so as to avoid scanning dropped columns but couldn't get it to work. I got a chance to dig more into and this patch seems to fix it.

diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index dd40a01f312..5a28b8369ff 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -2842,16 +2842,25 @@ CopyTo(CopyState cstate)
                                bool *proj = NULL;
 
                                int nvp = tupDesc->natts;
-                               int i;
 
                                if (tupDesc->tdhasoid)
                                {
                                    elog(ERROR, "OIDS=TRUE is not allowed on tables that use column-oriented storage. Use OIDS=FALSE");
                                }
 
-                               proj = palloc(sizeof(bool) * nvp);
-                               for(i = 0; i < nvp; ++i)
-                                   proj[i] = true;
+                               proj = palloc0(sizeof(bool) * nvp);
+                               /*
+                                * attnumlist is constructed after filtering out dropped
+                                * columns.  Use it to mark columns to be projected so as to
+                                * avoid reading (and decompressing) dropped column data from
+                                * disk.
+                                */
+                               foreach(cur, cstate->attnumlist)
+                               {
+                                       int                     attnum = lfirst_int(cur);
+                                       Assert(attnum <= nvp);
+                                       proj[attnum-1] = true;
+                               }
 
                                scan = aocs_beginscan(rel, GetActiveSnapshot(),
                                                                          GetActiveSnapshot(),

Note that SELECT and statements other than COPY get rid of drooped columns during parsing (see parse_target.c:ExpandRowReference()).

asimrp avatar Nov 09 '18 20:11 asimrp

Is this a problem because we are dumping and restoring then upgrading. In otherwords, do we know if this would be a problem if we were trying to upgrade the original database without trying to move the stuff onto a different environment?

Eulerizeit avatar Nov 13 '18 20:11 Eulerizeit

@Eulerizeit This currently only shows up if you do not dump and restore. It's not visible in the master pipeline's pg_upgrade test job because, once you dump and restore, the "hole" in the data that exposes the bug is gone.

jchampio avatar Nov 14 '18 16:11 jchampio

@melanieplageman you may want to keep eyes on this

skahler-vmware avatar Dec 21 '18 00:12 skahler-vmware

Confirming problem still exists in lastest GPDB6. We should explore the fix proposed above by Asim.

ashwinstar avatar Feb 14 '22 18:02 ashwinstar

I've confirmed the above patch resolves the issue in the repro steps on 6X_STABLE head and now correctly fetches only requested columns in a basic COPY aoco (i) to '/foo' test.

bmdoil avatar Jan 19 '23 23:01 bmdoil