gpdb icon indicating copy to clipboard operation
gpdb copied to clipboard

Report separate PostgreSQL and Greenplum versions in pg_controldata

Open asimrp opened this issue 4 years ago • 17 comments
trafficstars

The version number reported by pg_controldata utility as pg_control version number: used to include PostgreSQL as well as Greenplum version numbers combined as a single integer. This patch separates the two version numbers. Tools such as pg_upgrade and external modules such as pg_auto_failover expect native PostgreSQL version as an integer (e.g. 833, 942, 1201, etc) in "pg_control version number". This patch attempts to match that expectation. Greenplum version is reported on a separate line in similar format (e.g. 700) as pg_control Greenplum version number.

asimrp avatar Mar 31 '21 10:03 asimrp

@gfphoenix78 this change eliminates the need to handle Greenplum-specific version within pg_auto_failover contrib module.

asimrp avatar Mar 31 '21 10:03 asimrp

PR pipeline is red because of an unrelated test failure:

--- /tmp/build/b235ddbb/gpdb_src/src/test/fsync/expected/bgwriter_checkpoint.out	2021-03-31 11:44:09.129029432 +0000
+++ /tmp/build/b235ddbb/gpdb_src/src/test/fsync/results/bgwriter_checkpoint.out	2021-03-31 11:44:09.133029776 +0000
@@ -202,7 +202,15 @@
  as (tablespace oid, database oid, relfilenode oid, block int);
  tablespace | database | relfilenode | block 
 ------------+----------+-------------+-------
-(0 rows)
+       1663 |    27608 |        1259 |     0
+       1663 |    27608 |        1259 |     0
+       1663 |    27608 |        2619 |     0
+       1663 |    27608 |        2619 |     0
+       1663 |    27608 |        2619 |     6
+       1663 |    27608 |        2619 |     6
+       1663 |    27608 |        2696 |     1
+       1663 |    27608 |        2696 |     1
+(8 rows)

asimrp avatar Mar 31 '21 17:03 asimrp

It looks cool. I'll test this patch with pg_auto_failover.

gfphoenix78 avatar Apr 01 '21 01:04 gfphoenix78

@asimrp The pg_control_version written in the control file isn't changed, right? pg_auto_failover has a function pgsql_get_postgres_metadata() that directly calls the UDF pg_control_system() to get the pg_control_version. The pg_control_version looks un-converted.

bool
pgsql_get_postgres_metadata(PGSQL *pgsql,
                            bool *pg_is_in_recovery,
                            char *pgsrSyncState, char *currentLSN,
                            PostgresControlData *control)
{
    PgMetadata context = { 0 };

    /* *INDENT-OFF* */
    char *sql =
        /*
         * Make it so that we still have the current WAL LSN even in the case
         * where there's no replication slot in use by any standby.
         *
         * When on the primary, we might have multiple standby nodes connected.
         * We're good when at least one of them is either 'sync' or 'quorum'.
         * We don't check individual replication slots, we take the "best" one
         * and report that.
         */
        "select pg_is_in_recovery(),"
        " coalesce(rep.sync_state, '') as sync_state,"
        " case when pg_is_in_recovery()"
        " then coalesce(pg_last_wal_receive_lsn(), pg_last_wal_replay_lsn())"
        " else pg_current_wal_flush_lsn()"
        " end as current_lsn,"
        " pg_control_version, catalog_version_no, system_identifier"
        " from (values(1)) as dummy"
        " full outer join"
        " (select pg_control_version, catalog_version_no, system_identifier "
        "    from pg_control_system()"
        " )"

gfphoenix78 avatar Apr 01 '21 05:04 gfphoenix78

A minimal change I come up with in pg_auto_failover is:

diff --git a/src/bin/pg_autoctl/parsing.c b/src/bin/pg_autoctl/parsing.c
index 6ff04e1..4772bc6 100644
--- a/src/bin/pg_autoctl/parsing.c
+++ b/src/bin/pg_autoctl/parsing.c
@@ -154,6 +154,8 @@ parse_controldata(PostgresControlData *pgControlData,
                log_error("Failed to parse pg_controldata output");
                return false;
        }
+       Assert(pgControlData->pg_control_version > 10000);
+       pgControlData->pg_control_version /= 10000;
        return true;
 }

diff --git a/src/bin/pg_autoctl/pgsql.c b/src/bin/pg_autoctl/pgsql.c
index b3b89b4..172c163 100644
--- a/src/bin/pg_autoctl/pgsql.c
+++ b/src/bin/pg_autoctl/pgsql.c
@@ -2368,6 +2368,8 @@ parsePgMetadata(void *ctx, PGresult *result)
                context->parsedOk = true;
                return;
        }
+       Assert(context->control.pg_control_version > 10000);
+       context->control.pg_control_version /= 10000;

        value = PQgetvalue(result, 0, 4);
        if (!stringToUInt(value, &(context->control.catalog_version_no)))

gfphoenix78 avatar Apr 01 '21 06:04 gfphoenix78

@gfphoenix78 that change to pg_auto_failover is not right, it looks very Greenplum specific. If we have to make a change to pg_auto_failover, let's think if it will be accepted by upstream. And if the answer is no, let's try to avoid the change.

Good catch about pgsql_get_postgres_metadata! I've pushed a commit to address it.

asimrp avatar Apr 02 '21 12:04 asimrp

I'm wondering if we could revert the gp change against PG_CONTROL_VERSION instead. I do not see any need of adding the gp version in PG_CONTROL_VERSION now, presuming that we would continue PG merging and bump the pg version in each major release. Also the gp version information seems to be quite useless since we could conclude the gp veresion via the pg version.

paul-guo- avatar Apr 06 '21 09:04 paul-guo-

I'm wondering if we could revert the gp change against PG_CONTROL_VERSION instead. I do not see any need of adding the gp version in PG_CONTROL_VERSION now, presuming that we would continue PG merging and bump the pg version in each major release. Also the gp version information seems to be quite useless since we could conclude the gp veresion via the pg version.

I prefer to split the current pg_control_version and keep the semantics of pg_control_version aligned to the upstream.

gfphoenix78 avatar Apr 07 '21 00:04 gfphoenix78

I had tried reverting PG_CONTROL_VERSION macro to upstream version and having a separate GP_CONTROL_VERSION. That patch seemed a lot more intrusive that the current proposal. If you guys feel strongly about it, I can make another attempt.

asimrp avatar Apr 07 '21 12:04 asimrp

I had tried reverting PG_CONTROL_VERSION macro to upstream version and having a separate GP_CONTROL_VERSION. That patch seemed a lot more intrusive that the current proposal. If you guys feel strongly about it, I can make another attempt.

I meant we do not need GP_CONTROL_VERSION, just use upstream PG_CONTROL_VERSION. Is this feasible or reasonable?

paul-guo- avatar Apr 07 '21 12:04 paul-guo-

I had tried reverting PG_CONTROL_VERSION macro to upstream version and having a separate GP_CONTROL_VERSION. That patch seemed a lot more intrusive that the current proposal. If you guys feel strongly about it, I can make another attempt.

@asimrp Other guys may feel a bit confused about the pg_control_version: it contains two version fields in the control file, but pg_controldata/pg_control_system().pg_control_version returns only one part of the fields. I have no idea about the effort to split this value, please ignore me if it takes too much effort.

I meant we do not need GP_CONTROL_VERSION, just use upstream PG_CONTROL_VERSION. Is this feasible or reasonable?

It's probably up to whether a major GPDB version will have Postgres with a higher version. If not, we may need to have an additional control version for GP.

BTW, will the pg_control_version change if we accidentally have an incompatible GPDB in the minor upgrade?

gfphoenix78 avatar Apr 07 '21 15:04 gfphoenix78

I checked the code and asked the PG guys offline. PG_CONTROL_VERSION is actually the version number of the pgcontrol format so it could be same cross major releases in PG. I'm a bit confused now about how current gp definition of PG_CONTROL_VERSION (e.g. 12010700) affects the applications, Can you please elaborate?

paul-guo- avatar Apr 09 '21 03:04 paul-guo-

It's used to validate control version numbers. The server-side code and pg_rewind/pg_resetwal/pg_checksums can only use the control data file that is matched to its compiled version number. So, we can't use pg_rewind/pg_resetwal/pg_checksums with higher/lower versions to operate the server to prevent potential corruptions.

We don't extract the GP version from PG_CONTROL_VERSION, we use it as a whole. It's fine for GPDB to remove the GP version from PG_CONTROL_VERSION if we guarantee that the PG_CONTROL_VERSIONs are different among major GPDB releases. It's easy to have this promise. If the next major GPDB release has the same major PG version,

  1. It probably means that the pg_rewind/pg_resetwal/pg_checksums are still usable.
  2. To be safe, we may manually bump the minor version in PG_CONTROL_VERSION.

gfphoenix78 avatar Apr 09 '21 05:04 gfphoenix78

It's fine for GPDB to remove the GP version from PG_CONTROL_VERSION if we guarantee that the PG_CONTROL_VERSIONs are different among major GPDB releases. It's easy to have this promise. If the next major GPDB release has the same major PG version,

1. It probably means that the pg_rewind/pg_resetwal/pg_checksums are still usable.

2. To be safe, we may manually bump the minor version in PG_CONTROL_VERSION.

If the users use those tools that are shipped in the corresponding releases we do not need to worry about the version mismatch, right? I believe we'd ship the modified pg_auto_failover tool along with gpdb in the gp7 release, right?

paul-guo- avatar Apr 09 '21 06:04 paul-guo-

I believe we'd ship the modified pg_auto_failover tool along with gpdb in the gp7 release, right?

We should try our best to use native pg_auto_failover and NOT to fork it at all!

@paul-guo- thank you obtaining the confirmation from experts regarding PG_CONTROL_VERSION. Let you change this patch as you and Hao Wu are suggesting.

asimrp avatar Apr 09 '21 07:04 asimrp

Edit: let me change this patch, not you :)

asimrp avatar Apr 12 '21 07:04 asimrp

I feel in-general this change is good and we should pursue it for 7X.

ashwinstar avatar Sep 02 '22 19:09 ashwinstar

Closing this PR for now given the need for same is not realized so far. Given its not on-disk change if really requirement comes up again we can make the required change. It's hard to promise our pg_controlfile struct is same as upstream as there are GPDB added fields (like nextRelfilenode) to CHECKPOINT struct which is part of pgcontroldata, hence better to have GP version to convey the difference as the struct is not binary compatible with upstream control struct.

ashwinstar avatar Oct 26 '22 16:10 ashwinstar