gpdb
gpdb copied to clipboard
AO/CO tables with dropped columns cannot be reliably COPY'd after upgrade
Greenplum version or build
master
branch; likely applies to 4->6 and 5->6 upgrade cases as well
Problematic behavior
After a column is dropped from an AO/CO table and the table is run through pg_upgrade, the resulting table in the new cluster can't be COPY
'd if the column encoding of the dropped column doesn't match the default (e.g. it uses zlib
compression, etc.).
Steps to reproduce the behavior
make create-demo-cluster
# Load an AO/CO table with dropped columns
createdb regression
psql regression <<EOF
CREATE TABLE test(
a int,
b int ENCODING (compresstype=zlib,compresslevel=1),
c int
) WITH (appendonly=true, orientation=column);
INSERT INTO test VALUES (1, 2, 3), (4, 5, 6), (7, 8, 9);
ALTER TABLE test DROP COLUMN b;
EOF
# Run the upgrade test
createdb isolation2test
make -C contrib/pg_upgrade installcheck
This breaks during the final pg_dump call with
pg_dump: Dumping the contents of table "test" failed: PQgetResult() failed.
pg_dump: Error message from server: ERROR: decompression information missing (cdbappendonlystorageread.c:1417) (cdbappendonlystorageread.c:1417)
pg_dump: The command was: COPY public.test (a, c) TO stdout IGNORE EXTERNAL PARTITIONS;
Why?
When pg_upgrade recreates the table's schema, it adds the dropped column and re-drops it so that pg_attribute
and other pieces of the catalog understand that the data on disk has a hole where the column used to be. However, it doesn't add the column's encoding for AO/CO tables, so if the on-disk data is actually compressed, the COPY
code doesn't find any decompression information and bails out.
The problem gets worse with partitioned tables. The dropped column's encoding is also missing from the partition specifications -- but pg_dump can't fix that by itself, because the SQL for creating the partitioned table comes from the old cluster directly.
(Why the COPY
code needs to actually decompress a dropped column is left for another exercise, but it seems like the schema should match what's on disk in any case...?)
(Why the COPY code needs to actually decompress a dropped column is left for another exercise, but it seems like the schema should match what's on disk in any case...?)
This is a fallout of row oriented storage. Dropping a column of a table is a lightweight operation. Simply marking the column as dropped in the catalog, without changing the table's data files. Subsequent scan/update operations on the table must, therefore, interpret dropped columns and skip over them when constructing tuples from data files. This works well for row-oriented storage. Column-oriented storage offers better opportunity for dropping a column - the entire file representing the column can be removed from disk. This opportunity is not currently leveraged because the corresponding changes needed layers above storage layer (e.g. adjusting attribute numbers after drop so that they are consecutive) are not taken care of.
@l-wang and @jmcatamney and I were looking into changing copy so as to avoid scanning dropped columns but couldn't get it to work. I got a chance to dig more into and this patch seems to fix it.
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index dd40a01f312..5a28b8369ff 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -2842,16 +2842,25 @@ CopyTo(CopyState cstate)
bool *proj = NULL;
int nvp = tupDesc->natts;
- int i;
if (tupDesc->tdhasoid)
{
elog(ERROR, "OIDS=TRUE is not allowed on tables that use column-oriented storage. Use OIDS=FALSE");
}
- proj = palloc(sizeof(bool) * nvp);
- for(i = 0; i < nvp; ++i)
- proj[i] = true;
+ proj = palloc0(sizeof(bool) * nvp);
+ /*
+ * attnumlist is constructed after filtering out dropped
+ * columns. Use it to mark columns to be projected so as to
+ * avoid reading (and decompressing) dropped column data from
+ * disk.
+ */
+ foreach(cur, cstate->attnumlist)
+ {
+ int attnum = lfirst_int(cur);
+ Assert(attnum <= nvp);
+ proj[attnum-1] = true;
+ }
scan = aocs_beginscan(rel, GetActiveSnapshot(),
GetActiveSnapshot(),
Note that SELECT
and statements other than COPY
get rid of drooped columns during parsing (see parse_target.c:ExpandRowReference()
).
Is this a problem because we are dumping and restoring then upgrading. In otherwords, do we know if this would be a problem if we were trying to upgrade the original database without trying to move the stuff onto a different environment?
@Eulerizeit This currently only shows up if you do not dump and restore. It's not visible in the master pipeline's pg_upgrade
test job because, once you dump and restore, the "hole" in the data that exposes the bug is gone.
@melanieplageman you may want to keep eyes on this
Confirming problem still exists in lastest GPDB6. We should explore the fix proposed above by Asim.
I've confirmed the above patch resolves the issue in the repro steps on 6X_STABLE head and now correctly fetches only requested columns in a basic COPY aoco (i) to '/foo' test
.
This issue is resolved by the following PRs: Mark columns to be projected for COPY TO on AOCO table [6X]: In binary upgrade, dump the encoding clause for dropped columns 6X: Don't scan dropped columns during ANALYZE for AOCO tables
Tests have been added to gpupgrade. Test COPY TO/COPY FROM/ANALYZE on AOCO tables with a dropped column post upgrade