pg-clone-schema
pg-clone-schema copied to clipboard
Multiple issues cloning views
Hi,
Found some problems when running pg-clone-schema on my project:
- it does not handle the case where a normal view depends on a materialized view or vice-versa
- views which have indirect dependencies ('i') are not enumerated at all
CREATE OR REPLACEcannot be used with materialized views
I made the script 'work' on my schema. But I had to comment out the whole 'Create Materialized views' part - probably loosing a lot of needed functionality. I don't know how to fix it properly.
I'm attaching the modified script. Please compare ignoring white space changes, I messed it up. It was based on rev 164279a.
clone_schema_mod.sql.gz
Note: I removed the 'Create Materialized views' because the materialized view is created in the same place as are normal views. If i don't comment it out, that code will try to create it again and fail because the materialized view already exists.
@samodadela I think I have fixed it. Please try the latest code. Not understanding the 2nd bullet about some views are not being enumerated at all. So you are saying some view defs in the source schema are not being cloned?
Thanks @MichaelDBA I'll give it a spin.
Regarding the 2nd bullet, I was referring to this change:
Some of my views have deptype 'i' and so are missed by pg-clone-schema.
I tested the latest version.
I had to change the code as in my latest post (add IN ('n', 'i'), line 2819). Then the cloning of the schema works using 'NODATA'.
If I clone the schema using 'DATA' it fails, complaining that the materialized view does not exist. Here are the logs:
db.public> select clone_schema('t_src', 't_clone_schema_data', 'DATA', 'VERBOSE')
clone_schema version 2.2 March 05, 2024
Linux: PostgreSQL 16.4 (Ubuntu 16.4-0ubuntu0.24.04.2) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 13.2.0-23ubuntu4) 13.2.0, 64-bit
COLLATIONS cloned: 0
DOMAINS cloned: 0
TYPES cloned: 2
SEQUENCES cloned: 154
TABLES cloned: 246
[2024-10-07 10:35:09] [P0001] ERROR: Version: 2.2 March 05, 2024 Action: Mat. Views SearchPath: public oldSP=public newSP=t_src Diagnostics: line=PL/pgSQL function clone_schema(text,text,cloneparms[]) line 2029 at EXECUTE. 42P01. relation "t_clone_schema_data.materialized_view" does not exist
[2024-10-07 10:35:09] Where: PL/pgSQL function clone_schema(text,text,cloneparms[]) line 2965 at RAISE
SEQUENCES set: 154
IDENTITIES set: 0
FUNCTIONS cloned: 1
dependent view count=0 for view, view1
dependent view count=0 for view, view2
...
dependent view count=0 for view, view22
VIEWS cloned: 0
MAT VIEWS cloned: 1
Validated it is fixed for NODATA, but still needs to be fixed for DATA case...
@samodadela , please try the latest code, think i got it fixed for NODATA and DATA cases...
@MichaelDBA , I still had to change line 2819... can you take a look and incorporate that change if it makes sense?
AND d.deptype in ( 'n', 'i')
After that I tested NODATA: works.
And then DATA: works, too!
Populated cloned table, t_clone_schema_data.very_long_table_nam Rows Copied: 110 seconds: 0
Populated Mat. View, t_clone_schema_data.materialized_view01 Rows Inserted: ? seconds: 1319
Copy rows duration: 1367 seconds
TABLES copied: 246
MATVIEWS refreshed: 1
Deferrd VIEWS cloned: 1
TABLE PRIVS cloned: 1883
FKEYS cloned: 277
TRIGGERS cloned: 0
clone_schema duration: 1381 seconds
Thanks!
When comparing the original and cloned schema I noticed two more differences - an index and a sequence differ. Will report back when I find what's going on.
@samodadela , Thanks for the catch, please try the latest code.
@MichaelDBA Thanks for adding the 'i' - works now.
I have two minor problems.
My migrations contain something like this:
create table f_hub
(
id varchar(255) not null
constraint f_hub_id_key
unique,
...
)
create unique index xufhub
on f_hub (id);
pg-clone-schema clones it as:
create table t_clone_schema.f_hub
(
id varchar(255) not null
constraint xufhub
unique,
...
)
create unique index f_hub_id_key
on t_clone_schema.f_hub (id);
So it somehow swapped he name of the constraint and the index.
The other issue is with a sequence. pg-clone-schema creates an additional sequence, which doesn't exist in the original. This happens only for one sequence. Don't know if it matters, but that is the last sequence sorted alphabetically.
original:
-- auto-generated definition
create sequence w_locations_id_seq
as integer;
alter sequence w_locations_id_seq owner to dbuser;
alter sequence w_locations_id_seq owned by w_locations.id;
clone:
-- auto-generated definition
create sequence w_locations_id_seq
as integer;
alter sequence w_locations_id_seq owner to dbuser;
alter sequence w_locations_id_seq owned by w_locations.id;
-- auto-generated definition
create sequence w_locations_id_seq1
as integer;
alter sequence w_locations_id_seq1 owner to dbuser;
So w_locations_id_seq1 does not exist in the original.
I am not able to duplicate the swapping of index name with constraint name. Here is my test case: create schema test;
set search_path = 'test';
create table test.f_hub (id varchar(255) not null constraint f_hub_id_key unique, astring text);
create unique index xufhub on test.f_hub (id);
select clone_schema('test', 'clone1', 'NODATA');
Here is clone output with "\d clone1.*
Table "clone1.f_hub"
Column | Type | Collation | Nullable | Default
---------+------------------------+-----------+----------+---------
id | character varying(255) | | not null |
astring | text | | |
Indexes:
"f_hub_id_key" UNIQUE CONSTRAINT, btree (id)
"xufhub" UNIQUE, btree (id)
Index "clone1.f_hub_id_key"
Column | Type | Key? | Definition
--------+------------------------+------+------------
id | character varying(255) | yes | id
unique, btree, for table "clone1.f_hub"
Index "clone1.xufhub"
Column | Type | Key? | Definition
--------+------------------------+------+------------
id | character varying(255) | yes | id
unique, btree, for table "clone1.f_hub"
With respect to sequence number problem, I an unable to duplicate in using PG version 16.
Hi @MichaelDBA , sorry for the bad reproducers.
Here's the one for the sequence name:
drop schema if exists t_clone_schema cascade;
drop schema if exists t_test cascade;
create schema t_test;
create table if not exists t_test.w_locations
(
id serial primary key,
geometry geometry(Point, 4326) not null
);
create index if not exists idx_wl_location_geometry
on t_test.w_locations using gist (geometry);
The create index is required to reproduce the problem.
select clone_schema('t_test', 't_clone_schema', 'NODATA');
I'll try to create a minimal reproducer for the other one later.
Here's the reproducer for swapping of index name with constraint name:
drop schema if exists t_clone_schema cascade;
drop schema if exists t_test cascade;
create schema if not exists t_test;
create table if not exists t_test.f_f_hub
(
id varchar(255) not null
constraint f_hub_id_key unique
);
create unique index if not exists xufhub
on t_test.f_f_hub (id);
select clone_schema('t_test', 't_clone_schema', 'NODATA','VERBOSE');
@samodadela , I finally got around to fixing the constraint name problem. It brought up other areas I needed to fix as well. Will get to the sequence problem shortly... Let me know how this part is doing now though. Thanks in advance.
@MichaelDBA , I pulled the latest code. Now I see some differences that were not there before. Private keys get renamed. For example flyway_schema_history_pk will get renamed to flyway_schema_history_pkey. Also the cloned DDL is quite different from the original:
original:
create table t_test.flyway_schema_history
(
installed_rank integer not null
constraint flyway_schema_history_pk
primary key
);
cloned:
create table t_clone_schema.flyway_schema_history
(
installed_rank integer not null
primary key
);
Here's a reproducer:
drop schema if exists t_clone_schema cascade;
drop schema if exists t_test cascade;
create schema if not exists t_test;
create table t_test.flyway_schema_history
(
installed_rank integer not null
constraint flyway_schema_history_pk
primary key
);
select clone_schema('t_test', 't_clone_schema', 'NODATA', 'VERBOSE');
Fixed the primary key naming problem, but I don't see how the syntax changed for the primary key def in the clone.
\d t_clone_schema.flyway_schema_history
Table "t_clone_schema.flyway_schema_history"
Column | Type | Collation | Nullable | Default
----------------+---------+-----------+----------+---------
installed_rank | integer | | not null |
Indexes:
"flyway_schema_history_pk" PRIMARY KEY, btree (installed_rank)
and using pg_get_tabledef() returns what you want:
select * from pg_get_tabledef('t_clone_schema','flyway_schema_history', false);
pg_get_tabledef
--------------------------------------------------------------------
CREATE TABLE t_clone_schema.flyway_schema_history (
installed_rank integer NOT NULL,
CONSTRAINT flyway_schema_history_pk PRIMARY KEY (installed_rank)
) TABLESPACE pg_default;
The whole point about using the constraint syntax with primary keys is to have control over the name of the primary key and not let PG use its default naming convention. But the index behind the primary key is defined correctly with the constraint name.
Regarding the sequences, I count the sequences in the source and target after the clone and they are the same count.
SELECT count(*) as sequences FROM pg_class c, pg_namespace n where n.oid = c.relnamespace and c.relkind = 'S' and n.nspname = '<source schema>';
SELECT count(*) as sequences FROM pg_class c, pg_namespace n where n.oid = c.relnamespace and c.relkind = 'S' and n.nspname = '<target schema>';
I compared the output you got from pg_table_def() and it's not 100% the same. The syntax is a bit different. There are two items separated with a comma from pg_table_def() and the name of the constraint in brackets; while my original reproducer has only one 'sentence' (no comma). Here are both for comparison:
drop schema if exists t_clone_schema cascade;
drop schema if exists t_test cascade;
create schema if not exists t_test;
-- create table t_test.flyway_schema_history
-- (
-- installed_rank integer not null
-- constraint flyway_schema_history_pk
-- primary key
-- );
CREATE TABLE t_test.flyway_schema_history
(
installed_rank integer NOT NULL,
CONSTRAINT flyway_schema_history_pk PRIMARY KEY (installed_rank)
);
select clone_schema('t_test', 't_clone_schema', 'NODATA', 'VERBOSE');
If I use this new reproducer (using the definition from get_table_def()) - then there are no differences in the original and the clonse.
Digging some more it seems that both syntax's have the same effect. One sets the constraint in-line with the column, the other (get_table_def()) set it on the level of the whole table.
The problem is that the schemas do not compare so it's easy to miss other differences.
Can you add the output of get_table_def() for t_test.flyway_schema_history (the original)? I mean is the change in syntax caused by the cloning process or is get_table_def() that formats it that way?
Here is the complete output to it and I don't see any diffs. I am using PG v16
clone_testing=# drop schema if exists t_clone_schema cascade;
create schema if not exists t_test;
create table t_test.flyway_schema_history (installed_rank integer not null constraint flyway_schema_history_pk primary key);
select clone_schema('t_test', 't_clone_schema', 'NODATA', 'VERBOSE');
select * from pg_get_tabledef('t_test','flyway_schema_history', false);
select * from pg_get_tabledef('t_clone_schema','flyway_schema_history', false);
\d t_test.flyway_schema_history
\d t_clone_schema.flyway_schema_historyNOTICE: schema "t_clone_schema" does not exist, skipping
DROP SCHEMA
clone_testing=# drop schema if exists t_test cascade;
NOTICE: schema "t_test" does not exist, skipping
DROP SCHEMA
clone_testing=# create schema if not exists t_test;
CREATE SCHEMA
clone_testing=# create table t_test.flyway_schema_history (installed_rank integer not null constraint flyway_schema_history_pk primary key);
CREATE TABLE
clone_testing=# select clone_schema('t_test', 't_clone_schema', 'NODATA', 'VERBOSE');
NOTICE: clone_schema version 2.3 October 25, 2024
NOTICE: Linux: PostgreSQL 16.4 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.5.0 20210514 (Red Hat 8.5.0-22), 64-bit
NOTICE: COLLATIONS cloned: 0
NOTICE: DOMAINS cloned: 6
NOTICE: TYPES cloned: 2
NOTICE: SEQUENCES cloned: 0
NOTICE: TABLES cloned: 1
NOTICE: SEQUENCES set: 0
NOTICE: IDENTITIES set: 0
NOTICE: FUNCTIONS cloned: 0
NOTICE: VIEWS cloned: 0
NOTICE: MAT VIEWS cloned: 0
NOTICE: Deferrd VIEWS cloned: 0
NOTICE: RULES cloned: 0
NOTICE: POLICIES cloned: 0
NOTICE: COMMENTS(1) cloned: 0
NOTICE: COMMENTS(2) cloned: 0
NOTICE: DFLT PRIVS cloned: 0
NOTICE: SCHEMA PRIVS cloned: 0
NOTICE: SEQ. PRIVS cloned: 0
NOTICE: FUNC PRIVS cloned: 0
NOTICE: TABLE PRIVS cloned: 7
NOTICE: FKEYS cloned: 0
NOTICE: TRIGGERS cloned: 0
NOTICE: TABLES copied: 0
NOTICE: MATVIEWS refreshed: 0
NOTICE: clone_schema duration: 1 seconds
clone_schema
--------------
(1 row)
clone_testing=# select * from pg_get_tabledef('t_test','flyway_schema_history', false);
pg_get_tabledef
--------------------------------------------------------------------
CREATE TABLE t_test.flyway_schema_history ( +
installed_rank integer NOT NULL, +
CONSTRAINT flyway_schema_history_pk PRIMARY KEY (installed_rank)+
) TABLESPACE pg_default; +
clone_testing=# select * from pg_get_tabledef('t_clone_schema','flyway_schema_history', false);
pg_get_tabledef
--------------------------------------------------------------------
CREATE TABLE t_clone_schema.flyway_schema_history ( +
installed_rank integer NOT NULL, +
CONSTRAINT flyway_schema_history_pk PRIMARY KEY (installed_rank)+
) TABLESPACE pg_default; +
clone_testing=# \d t_test.flyway_schema_history
Table "t_test.flyway_schema_history"
Column | Type | Collation | Nullable | Default
----------------+---------+-----------+----------+---------
installed_rank | integer | | not null |
Indexes:
"flyway_schema_history_pk" PRIMARY KEY, btree (installed_rank)
clone_testing=# \d t_clone_schema.flyway_schema_history
Table "t_clone_schema.flyway_schema_history"
Column | Type | Collation | Nullable | Default
----------------+---------+-----------+----------+---------
installed_rank | integer | | not null |
Indexes:
"flyway_schema_history_pk" PRIMARY KEY, btree (installed_rank)
@samodadela comment???
@MichaelDBA Sorry for the late reply. I retested this again and you are right, the constraint is cloned correctly regardless of the syntax it was created with. I'm also using PG 16 (16.4).
Here's how I tested (created both at the same time):
drop schema if exists t_clone_schema cascade;
drop schema if exists t_test cascade;
create schema if not exists t_test;
create table t_test.flyway_schema_history_1
(
installed_rank integer not null
constraint flyway_schema_history_1_pk
primary key
);
CREATE TABLE t_test.flyway_schema_history_2
(
installed_rank integer NOT NULL,
CONSTRAINT flyway_schema_history_2_pk PRIMARY KEY (installed_rank)
);
select clone_schema('t_test', 't_clone_schema', 'NODATA', 'VERBOSE');
I'm still seeing some differences - I'll post a reproducer when I have it.
I'm sorry this is dragging for so long.
@MichaelDBA
Here's another difference reproducer. Don't know if I reported this one before - I lost track off all the changes.
drop schema if exists t_clone_schema cascade;
drop schema if exists t_test cascade;
create schema if not exists t_test;
create table t_test.eos
(
id bigint not null primary key,
session_id uuid constraint uk_nyq unique
);
select clone_schema('t_test', 't_clone_schema', 'NODATA', 'VERBOSE');
Original DDL:
create table t_test.eos
(
id bigint not null primary key,
session_id uuid constraint uk_nyq unique
);
Clone:
create table t_clone_schema.eos
(
id bigint not null primary key,
session_id uuid unique
);
I am not able to reproduce what you are showing:
clone_testing=# drop schema if exists t_clone_schema cascade;
NOTICE: schema "t_clone_schema" does not exist, skipping
DROP SCHEMA
clone_testing=# drop schema if exists t_test cascade;
NOTICE: schema "t_test" does not exist, skipping
DROP SCHEMA
clone_testing=# create schema if not exists t_test;
CREATE SCHEMA
clone_testing=# create table t_test.eos (id bigint not null primary key, session_id uuid constraint uk_nyq unique);
CREATE TABLE
clone_testing=# select clone_schema('t_test', 't_clone_schema', 'NODATA', 'VERBOSE');
NOTICE: clone_schema version 2.6 November 06, 2024
NOTICE: Linux: PostgreSQL 16.4 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.5.0 20210514 (Red Hat 8.5.0-22), 64-bit
NOTICE: COLLATIONS cloned: 0
NOTICE: DOMAINS cloned: 6
NOTICE: TYPES cloned: 2
NOTICE: SEQUENCES cloned: 0
NOTICE: TABLES cloned: 1
NOTICE: SEQUENCES set: 0
NOTICE: IDENTITIES set: 0
NOTICE: FUNCTIONS cloned: 0
NOTICE: VIEWS cloned: 0
NOTICE: MAT VIEWS cloned: 0
NOTICE: Deferrd VIEWS cloned: 0
NOTICE: RULES cloned: 0
NOTICE: POLICIES cloned: 0
NOTICE: COMMENTS(1) cloned: 0
NOTICE: COMMENTS(2) cloned: 0
NOTICE: DFLT PRIVS cloned: 0
NOTICE: SCHEMA PRIVS cloned: 0
NOTICE: SEQ. PRIVS cloned: 0
NOTICE: FUNC PRIVS cloned: 0
NOTICE: TABLE PRIVS cloned: 7
NOTICE: FKEYS cloned: 0
NOTICE: TRIGGERS cloned: 0
NOTICE: TABLES copied: 0
NOTICE: MATVIEWS refreshed: 0
NOTICE: clone_schema duration: 0 seconds
clone_schema
--------------
0
(1 row)
clone_testing=# \d t_test.eos
Table "t_test.eos"
Column | Type | Collation | Nullable | Default
------------+--------+-----------+----------+---------
id | bigint | | not null |
session_id | uuid | | |
Indexes:
"eos_pkey" PRIMARY KEY, btree (id)
"uk_nyq" UNIQUE CONSTRAINT, btree (session_id)
clone_testing=# \d t_clone_schema.eos
Table "t_clone_schema.eos"
Column | Type | Collation | Nullable | Default
------------+--------+-----------+----------+---------
id | bigint | | not null |
session_id | uuid | | |
Indexes:
"eos_pkey" PRIMARY KEY, btree (id)
"eos_session_id_key" UNIQUE CONSTRAINT, btree (session_id)
Also, the only diff is that the constraint name is different sometimes due to the way some tables get cloned. The following is a bullet from the LIMITATIONS section in the README:
Index names are not all the same in the cloned schema since some of the tables are created with the CREATE TABLE ... (LIKE ...) construct. Those index names are automatically fabricated by PG with naming format that is prepended with table and column names separated by underscores and ending with "_idx" or "_key".
@samodadela , any response before i close this?
I get the DDLs using IntelliJ IDEA Ultimate. I'm selecting two schemas and comparing them. Maybe that's why I'm seeing those diffs (they have a different way of generating the DDL).
Anyway because of the limitation the schemas won't be exactly the same. Problem is I have to wade through 50+ differences and check that there are any other issues. A possible workaround could be to rename the indexes to their original names after they are created?
I think I have bothered you enough. Thanks for your help and patience.
Thanks for pointing out stuff that we did fix with this ticket. Closing it now.