pg_cron icon indicating copy to clipboard operation
pg_cron copied to clipboard

Hard pg_cron 1.5+ dependency on pg_catalog breaks restores of dumps created when using older extension versions

Open pcnc opened this issue 2 years ago • 17 comments

Summary

We've noticed that restoring Postgres dumps created when using older versions of pg_cron (namely 1.4.2 in our case) against a database where pg_cron 1.5+ is installed break due to a newly introduced hard, and undocumented, requirement on creating the extension in the pg_catalog schema.

Description

pg_dump generates the extension-related section of its database dumps in the following format, explicitly stating the schema where the extension was enabled at dump-time:

--
-- Name: pg_cron; Type: EXTENSION; Schema: -; Owner: -
--

CREATE EXTENSION IF NOT EXISTS pg_cron WITH SCHEMA public;

This dump was created off a Postgres install running pg_cron 1.4.2, with the extension enabled in the public schema by running CREATE EXTENSION pg_cron;. As the user's search_path was set to the public schema, the extension defaulted to it during creation.

When restoring the above dump against a database service running pg_cron 1.5+, the following error is encountered:

psql:[email protected]:203: ERROR:  extension "pg_cron" must be installed in schema "pg_catalog"

This, in turn has the following effect:

  • when allowing pg_restore to continue on error - pg_cron is not created, and any previously created cronjob does not exist post-restore
  • when using pg_restore in conjunction with --exit-on-error - restore fails

pcnc avatar Jun 01 '23 07:06 pcnc

There is a workaround with the options -l (lower case L) and -L of pg_restore:

  • First, use pg_restore -l dump > list to list all objects present in your dump into a file
  • Edit this file to remove the mentions of pg_cron creation, e.g. these two lines in my case:
    2; 3079 40049 EXTENSION - pg_cron 
    8038; 0 0 COMMENT - EXTENSION pg_cron
    
  • Run CREATE EXTENSION pg_cron; with psql
  • Use pg_restore -L list dump to restore your dump without pg_cron extension creation, after editing the list file

Exagone313 avatar Jun 02 '23 14:06 Exagone313

Just got hit by this too 😞 1.4.2 worked fine, so I'm going to stick with it.

cocowalla avatar Jun 23 '23 09:06 cocowalla

@Exagone313 - thanks for the workaround! This'll surely help out others which are impacted by the same issue.

Unfortunately when dealing with very large backups, or backups created using pg_backupall, the approach is either computationally intensive, or non viable, which renders us unable to upgrade to pg_cron 1.5+ across the platform.

@marcocitus - sorry to ping you like this, but is there any other workaround which could be implemented, if the hard requirement to create pg_cron in the pg_catalog schema is unlikely to be changed?

pcnc avatar Jun 30 '23 09:06 pcnc

Are there any updates to this issue?

mattaylor avatar Jul 27 '23 12:07 mattaylor

@pcnc do you also run into #113 ? This is a limitation in pg_dump&pg_restore that already breaks pg_cron due to the policies that the extension create.

After some discussion on #289, it feels safer to me if we keep pg_catalog as the default schema in the control file.

hanefi avatar Oct 16 '23 12:10 hanefi

I'm confused as to why https://github.com/citusdata/pg_cron/pull/289 was closed without being merged. It's a breaking and undocumented change. Can anyone shed any light on this?

philip-harvey avatar Nov 07 '23 23:11 philip-harvey

Slightly changing the output from pg_dump should allow you to work arround this issue. With something like this you should be able to automate that (untested command).

pg_dump | sed 's/^CREATE EXTENSION IF NOT EXISTS pg_cron WITH SCHEMA public;$/CREATE EXTENSION IF NOT EXISTS pg_cron;'

But then you'll likely still hit: https://github.com/citusdata/pg_cron/issues/113

JelteF avatar Nov 08 '23 08:11 JelteF

This is a breaking change for a bunch of reasons, not just for restores.

philip-harvey avatar Nov 15 '23 23:11 philip-harvey

@philip-harvey can you explain in what other ways it is a breaking change for you?

JelteF avatar Dec 11 '23 14:12 JelteF

@philip-harvey can you explain in what other ways it is a breaking change for you?

At a minimum, all code that specifies or specified a schema for pg_cron is now broken. In our case it's Terraform code. Also we have a ton of databases, and in all existing databases pg_cron is installed into a specific schema. We can now no longer create more instances that match the configuration, so we can't perform any testing now.

philip-harvey avatar Dec 11 '23 15:12 philip-harvey

@Exagone313 - thanks for the workaround! This'll surely help out others which are impacted by the same issue.

Unfortunately when dealing with very large backups, or backups created using pg_backupall, the approach is either computationally intensive, or non viable, which renders us unable to upgrade to pg_cron 1.5+ across the platform.

@marcocitus - sorry to ping you like this, but is there any other workaround which could be implemented, if the hard requirement to create pg_cron in the pg_catalog schema is unlikely to be changed?

What is the plan for this issue, please?

Is there any other workaround that could be implemented, if the hard requirement to create pg_cron in the pg_catalog schema is unlikely to be changed?

I am using the Supabase and sub-minute cron jobs are needed. Unfortunately, Supabase can not use pg_cron 1.5+ :(

idugalic avatar Jan 05 '24 12:01 idugalic

At a minimum, all code that specifies or specified a schema for pg_cron is now broken. In our case it's Terraform code.

I think that's fair.

Perhaps we should use ProcessUtility to fix CREATE EXTENSION statements that use the old schema?

marcoslot avatar Jan 08 '24 12:01 marcoslot

Are there any updates to this issue?

idugalic avatar Mar 16 '24 18:03 idugalic

Also ran into this forced pg_catalog schema issue in the pgq extension as well. This seems a very bad practice to start implementing in extensions. Looking at the sql the objects are still being created in a cron schema anyway, so this is very confusing when one does a \dx or catalog lookup because PG thinks the extension is in pg_catalog but all the objects are still in cron.

https://github.com/citusdata/pg_cron/blob/main/pg_cron.sql#L15

If you remove the code to forcefully make a schema and instead just have the schema label in the control file set to cron then it creates the desired schema by default as long as the SCHEMA flag is not set. It doesn't allow the user to specify their own schema then, but if that's the desired result, then that's fine.

https://github.com/pgq/pgq/issues/22#issuecomment-2045868135

keithf4 avatar Apr 11 '24 17:04 keithf4

Saw the other open issue talking about allowing custom schemas and the reason for not doing this is security concerns.

https://github.com/citusdata/pg_cron/issues/225

If the pg_cron code is done properly (fully schema qualify all object reference calls), all security concerns around custom schemas can be avoided. I know this because I did it in pg_partman, even correcting a very serious CVE in the past about it.

Pretty sure any user code touching pg_catalog like this would be looked down upon by core.

keithf4 avatar Apr 11 '24 17:04 keithf4

Just to show with current version, because all object creation is schema qualified to cron, the extension thinks it's installed to pg_catalog but it's all forced into cron

keith=# \dx
                                      List of installed extensions
    Name    | Version |    Schema     |                           Description                           
------------+---------+---------------+-----------------------------------------------------------------
 pg_cron    | 1.6     | pg_catalog    | Job scheduler for PostgreSQL
 pg_partman | 5.1.0   | partman       | Extension to manage partitioned tables by time or ID
 pgtap      | 1.2.0   | public        | Unit testing for PostgreSQL
 plpgsql    | 1.0     | pg_catalog    | PL/pgSQL procedural language
(5 rows)

keith=# \dx+ pg_cron
                    Objects in extension "pg_cron"
                          Object description                          
----------------------------------------------------------------------
 function cron.alter_job(bigint,text,text,text,text,boolean)
 function cron.job_cache_invalidate()
 function cron.schedule_in_database(text,text,text,text,text,boolean)
 function cron.schedule(text,text)
 function cron.schedule(text,text,text)
 function cron.unschedule(bigint)
 function cron.unschedule(text)
 schema cron
 sequence cron.jobid_seq
 sequence cron.runid_seq
 table cron.job
 table cron.job_run_details

keithf4 avatar Apr 11 '24 18:04 keithf4

FWIW I filed a bug report against PostgreSQL to fix alter extension set schema to allow the DBA to update their installed extension with the pg_catalog schema reference once v1.5+ is installed on said machine. Right now the original user-specified schema is retained and so even after upgrading pg_dump outputs now invalid and incompatible data. The workarounds are to drop extension and create it again or to manually alter the relevant catalog entry as superuser under catalog update mode.

https://www.postgresql.org/message-id/CAKFQuwa1cb9y6OW9rSmXFGH69J3ZEQOm3+Od4Ang4qi74YHcTw@mail.gmail.com

polobo avatar Apr 16 '24 03:04 polobo