Trying to downgrade to 10.2 gives error

Open marcocitus opened this issue 2 years ago • 1 comments

After installing 10.2 with the install-downgrades from main, I get the following error:

postgres=# alter extension citus update;
ERROR:  cache lookup failed for pg_dist_object, called too early?
CONTEXT:  SQL statement "SELECT master_unmark_object_distributed(v_obj.classid, v_obj.objid, v_obj.objsubid)"

marcocitus avatar Jun 30 '22 20:06 marcocitus

Another error I'm hitting (using new binaries):

postgres=# alter extension citus update TO "10.2-2";
ERROR:  cannot drop constraint stripe_first_row_number_idx on table columnar.stripe because other objects depend on it
DETAIL:  access method columnar depends on index columnar.stripe_first_row_number_idx
HINT:  Use DROP ... CASCADE to drop the dependent objects too.

marcocitus avatar Jul 04 '22 09:07 marcocitus

Hello, @marcocitus Is my understanding correct that it is not possible to downgrade from 11 to 10.2 due to this bug?

ivyazmitinov avatar Sep 23 '22 07:09 ivyazmitinov

I have not been able to reproduce the first error message yet, but I can reproduce the second one.

The second problem occurs when you upgrade to >=10.2-4 and try to downgrade to <=10.2-2.


@ivyazmitinov

Is my understanding correct that it is not possible to downgrade from 11 to 10.2 due to this bug?

This does not mean that downgrades from 11 to 10.2 are broken. We always recommend using the latest patch version, so you can safely downgrade to 10.2-5, which is not impacted at all.

hanefi avatar Sep 28 '22 14:09 hanefi

I have opened a PR that fixes the second error message shared in this issue.

I still cannot reproduce the first one, but I can already think of a possible fix. For some reason, the cache for pg_dist_object is not ready. I could remove all references to caches from master_unmark_object_distributed and see whether that resolves the issue. This needs some more internal discussion before I move forward.

hanefi avatar Sep 28 '22 19:09 hanefi

After installing 10.2 with the install-downgrades from main, I get the following error:

postgres=# alter extension citus update;
ERROR:  cache lookup failed for pg_dist_object, called too early?
CONTEXT:  SQL statement "SELECT master_unmark_object_distributed(v_obj.classid, v_obj.objid, v_obj.objsubid)"

In Citus 10.2, pg_dist_object was in the citus schema. In Citus 11.0, it is in the pg_catalog schema. When downgrading, we move this table from pg_catalog back to citus. However, if the 10.2 binaries try to access the table before this downgrade migration step has run, they look for it in the citus schema while it is still in pg_catalog, and the lookup fails.

On Citus 11.0, we check for this relation in both schemas. Maybe we should backport this logic to 10.2.

https://github.com/citusdata/citus/blob/86e186f671d05f0dce9d4fb4f27a91c182f1354f/src/backend/distributed/metadata/metadata_cache.c#L2651-L2688
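
To illustrate the idea of checking both schemas, here is a minimal standalone sketch in C. It is not the Citus implementation: the catalog is mocked with a plain array, and `MockRelation`, `mock_lookup`, and `lookup_dist_object_relid` are hypothetical names invented for this example. The lookup order (pg_catalog first, then citus) is also an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a catalog entry; not a PostgreSQL or Citus type. */
typedef struct
{
	const char *schema;
	const char *relname;
	unsigned int oid;
} MockRelation;

#define INVALID_OID 0u

/* Mocked lookup: returns the OID of schema.relname, or INVALID_OID if absent. */
unsigned int
mock_lookup(const MockRelation *catalog, size_t count,
			const char *schema, const char *relname)
{
	assert(schema != NULL && relname != NULL);

	for (size_t i = 0; i < count; i++)
	{
		if (strcmp(catalog[i].schema, schema) == 0 &&
			strcmp(catalog[i].relname, relname) == 0)
		{
			return catalog[i].oid;
		}
	}
	return INVALID_OID;
}

/*
 * Look for pg_dist_object in pg_catalog (its 11.0 location) and, if it is
 * not there, fall back to the citus schema (its 10.2 location). With this
 * fallback the lookup succeeds both before and after the migration step
 * that moves the table between schemas.
 */
unsigned int
lookup_dist_object_relid(const MockRelation *catalog, size_t count)
{
	unsigned int relid = mock_lookup(catalog, count, "pg_catalog", "pg_dist_object");

	if (relid == INVALID_OID)
	{
		relid = mock_lookup(catalog, count, "citus", "pg_dist_object");
	}
	return relid;
}
```

In this sketch, a caller still has to handle `INVALID_OID`, which corresponds to the case where the table exists in neither schema.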

hanefi avatar Oct 06 '22 12:10 hanefi