pgcopydb icon indicating copy to clipboard operation
pgcopydb copied to clipboard

Refresh materialized views fails during post data pg_restore phase

Open VaibhaveS opened this issue 1 year ago • 3 comments

REFRESH MATERIALIZED VIEWS fail when there is a reference to a object which is not fully qualified and the query is enclosed in a string.

Query to reproduce

CREATE SCHEMA pgcopydb;

CREATE TABLE pgcopydb.test ("Name" text,  "LocalTag" text);

CREATE MATERIALIZED VIEW public.test_view AS
SELECT row_number() OVER (PARTITION BY true::boolean) AS row_number,
    ts_stat.word
   FROM ts_stat('SELECT to_tsvector(''simple'', "Name") ||
                 to_tsvector(''simple'', coalesce("LocalTag",'' ''))
            FROM test'::text) ts_stat(word, ndoc, nentry)
  WITH NO DATA;

SET search_path = 'pgcopydb';

REFRESH MATERIALIZED VIEW public.test_view;

Logs

19:26:42 43355 INFO    /usr/bin/pg_restore --dbname 'postgres://[email protected]/postgres?keepalives=1&keepalives_idle=10&keepalives_interval=10&keepalives_count=60' --single-transaction --clean --if-exists --use-list /tmp/pgcopydb/schema/post-filtered.list /tmp/pgcopydb/schema/post.dump
19:26:48 43355 ERROR  pg_restore: while PROCESSING TOC:
19:26:48 43355 ERROR  pg_restore: from TOC entry 4279; 0 139077 MATERIALIZED VIEW DATA test_view azureuser
19:26:48 43355 ERROR  pg_restore: error: could not execute query: ERROR:  relation "test" does not exist
19:26:48 43355 ERROR  LINE 3:             FROM test
19:26:48 43355 ERROR                           ^
19:26:48 43355 ERROR  QUERY:  SELECT to_tsvector('simple', "Name") ||
19:26:48 43355 ERROR                   to_tsvector('simple', coalesce("LocalTag",' '))
19:26:48 43355 ERROR              FROM test
19:26:48 43355 ERROR  Command was: REFRESH MATERIALIZED VIEW public.test_view;
19:26:48 43355 ERROR  Failed to run pg_restore: exit code 1
19:26:48 43355 ERROR  Failed to clone source database, see above for details
19:26:48 43353 ERROR  clone process 43355 has terminated [6]

VaibhaveS avatar Sep 30 '23 13:09 VaibhaveS

First of all, thanks @dimitri for this great tool!

I ran into a similar problem when migrating from postgres 15.3 to 16.3. I used this simple script:

SOURCE_DB="..."
TARGET_DB="..."
DATA_DIR="/mnt/data/pgcopydb2"  # path where I have some free space
mkdir -p "$DATA_DIR"
chown -R 999:999 "$DATA_DIR"
time docker run \
    -v "$DATA_DIR:/data" \
    --rm -it \
    dimitri/pgcopydb:v0.16 \
    pgcopydb clone \
    --dir /data \
    --source "$SOURCE_DB" \
    --target "$TARGET_DB" \
    --table-jobs=8 \
    --index-jobs=2

And the result was as follows (changed URIs to <source-db> and <target-db> for privacy):

20:25:38.351 1 INFO   Running pgcopydb version 0.16 from "/usr/local/bin/pgcopydb"
20:25:38.352 1 INFO   [SOURCE] Copying database from "postgres://<source-db>?keepalives=1&keepalives_idle=10&keepalives_interval=10&keepalives_count=60"
20:25:38.352 1 INFO   [TARGET] Copying database into "postgres://<target-db>?keepalives=1&keepalives_idle=10&keepalives_interval=10&keepalives_count=60"
20:25:38.374 1 INFO   Using work dir "/data"
20:25:38.752 1 INFO   Exported snapshot "00000003-0001A517-1" from the source database
20:25:38.945 23 INFO   STEP 1: fetch source database tables, indexes, and sequences
20:25:42.520 23 INFO   Fetched information for 16 tables (including 0 tables split in 0 partitions total), with an estimated total of 173 496 tuples and 2739 MB on-disk
20:25:42.587 23 INFO   Fetched information for 53 indexes (supporting 27 constraints)
20:25:42.603 23 INFO   Fetching information for 14 sequences
20:25:43.235 23 INFO   Fetched information for 4 extensions
20:25:43.926 23 INFO   Found 0 indexes (supporting 0 constraints) in the target database
20:25:43.940 23 INFO   STEP 2: dump the source database schema (pre/post data)
20:25:43.947 23 INFO    /usr/bin/pg_dump -Fc --snapshot 00000003-0001A517-1 --section pre-data --file /data/schema/pre.dump 'postgres://<source-db>?keepalives=1&keepalives_idle=10&keepalives_interval=10&keepalives_count=60'
20:25:46.584 23 INFO    /usr/bin/pg_dump -Fc --snapshot 00000003-0001A517-1 --section post-data --file /data/schema/post.dump 'postgres://<source-db>?keepalives=1&keepalives_idle=10&keepalives_interval=10&keepalives_count=60'
20:25:48.617 23 INFO   STEP 3: restore the pre-data section to the target database
20:25:48.704 23 INFO    /usr/bin/pg_restore --dbname '<target-db>?keepalives=1&keepalives_idle=10&keepalives_interval=10&keepalives_count=60' --jobs 2 --use-list /data/schema/pre-filtered.list /data/schema/pre.dump
20:25:50.119 31 INFO   STEP 4: starting 8 table-data COPY processes
20:25:50.287 34 INFO   STEP 8: starting 8 VACUUM processes
20:25:50.356 32 INFO   STEP 6: starting 2 CREATE INDEX processes
20:25:50.356 32 INFO   STEP 7: constraints are built by the CREATE INDEX processes
20:25:50.605 23 INFO   Skipping large objects: none found.
20:25:50.638 23 INFO   STEP 9: reset sequences values
20:25:50.639 53 INFO   Set sequences values on the target database
20:35:29.877 23 INFO   STEP 10: restore the post-data section to the target database
20:35:31.031 23 INFO    /usr/bin/pg_restore --dbname 'postgres://<target-db>?keepalives=1&keepalives_idle=10&keepalives_interval=10&keepalives_count=60' --jobs 2 --use-list /data/schema/post-filtered.list /data/schema/post.dump
20:35:32.711 23 ERROR  pg_restore: error: could not execute query: ERROR:  relation "documents_document" does not exist
20:35:32.711 23 ERROR  LINE 1: SELECT search_data FROM documents_document
20:35:32.711 23 ERROR                                  ^
20:35:32.711 23 ERROR  QUERY:  SELECT search_data FROM documents_document
20:35:32.712 23 ERROR  Command was: REFRESH MATERIALIZED VIEW public.documents_word;
20:35:32.774 23 WARN   pg_restore: warning: errors ignored on restore: 2
20:35:32.774 23 ERROR  Failed to run pg_restore: exit code 1
20:35:32.774 23 ERROR  Failed to finalize schema on the target database, see above for details
20:35:32.774 23 ERROR  Failed to clone source database, see above for details
20:35:32.893 1 ERROR  clone process 23 has terminated [6]

real    10m3.633s
user    0m0.014s
sys     0m0.014s

I successfully migrated another database (which also had one matview) from 14.5 to 16.3 using the exact same procedure.

I'm not sure if this helps, help tried grep for documents_word (the matview that caused the error) on pgcopydb data path and found this:

# grep -rHni documents_word /mnt/data/pgcopydb2 
/mnt/data/pgcopydb2/schema/post-out.list:59:3410; 1259 699431 INDEX public documents_word_word_idx postgres
/mnt/data/pgcopydb2/schema/post-out.list:60:3411; 1259 699433 INDEX public documents_word_word_idx1 postgres
/mnt/data/pgcopydb2/schema/post-out.list:90:3605; 0 699343 MATERIALIZED VIEW DATA public documents_word postgres
grep: /mnt/data/pgcopydb2/schema/pre.dump: binary file matches
/mnt/data/pgcopydb2/schema/pre-filtered.list:30:248; 1259 699343 MATERIALIZED VIEW public documents_word postgres
/mnt/data/pgcopydb2/schema/post-filtered.list:44:3410; 1259 699431 INDEX public documents_word_word_idx postgres
/mnt/data/pgcopydb2/schema/post-filtered.list:45:3411; 1259 699433 INDEX public documents_word_word_idx1 postgres
/mnt/data/pgcopydb2/schema/post-filtered.list:75:3605; 0 699343 MATERIALIZED VIEW DATA public documents_word postgres
grep: /mnt/data/pgcopydb2/schema/post.dump: binary file matches
/mnt/data/pgcopydb2/schema/pre-out.list:45:248; 1259 699343 MATERIALIZED VIEW public documents_word postgres

I can share these files if needed.

Since this matview was not used much, I decided to delete it from the source database, created a new target server and run pgcopydb clone command again. This time it worked successfully! I can recreate the matview manually on the target database later.

I believe this issue is partially related to PR #501 ("the first step towards support for REFRESHing materialized views within pgcopydb instead of in pg_restore"). A possible solution could be rearranging the order of operations in list files: a quick fix would be to always copy matviews and its indexes last and a smarter approach would be to detect the table dependencies for each matview and run them only after the dependent tables have been imported.

turicas avatar Jun 23 '24 13:06 turicas

I tried again with a fresh target database and the options --table-jobs=1 --index-jobs=1 --restore-jobs=1 to check if reducing parallelism might resolve the issue, but the same error occurred.

turicas avatar Jun 23 '24 14:06 turicas

Hello @turicas

Can you check whether the definition of your materialized view documents_word contain only qualified references to documents_document relation? If so, I suggest you fix the definition by qualifying them with schema names.

If there exists an unqualified reference to documents_document relation, pg_restore may fail. This is an intentional failure on pg_restore as allowing object lookups along the search path is prone to abuse.

hanefi avatar Jun 24 '24 20:06 hanefi