pgloader
pgloader copied to clipboard
Duration at the summary page is wrong
I use pgloader to migrate a huge database from MariaDB to PostgreSQL. Based on the summary page, the migration takes too long, so I tried to adjust the parameters. Then I noticed that either the duration in the summary page or the times in the NOTICE lines are wrong.
The summary shows almost 8 hours, but the job took only 2,5 hours.
I can always reproduce this, even on different machines. All machines are virtual, running in rather big data centres at the office or at my customer.
- [x] pgloader --version pgloader version "3.6.af8c3c1" compiled with Clozure Common Lisp Version 1.12.2 (v1.12.2) LinuxX8664
Also reproducable with the other compiler (forget the name).
-
[x] did you test a fresh compile from the source tree? yes
-
[x] did you search for other similar issues? yes, none found.
-
[x] how can I reproduce the bug? yes, always
Started with
pgloader --client-min-messages=notice database.load
database.load
LOAD DATABASE
FROM mysql://root:password@host/database
INTO pgsql://root:password@localhost:5432/database
WITH
no truncate,
create tables,
include drop,
create indexes,
reset sequences,
foreign keys,
downcase identifiers,
uniquify index names,
prefetch rows = 10000,
rows per range = 10000,
batch rows = 10000,
workers = 4,
concurrency = 1
SET MySQL PARAMETERS
net_read_timeout = '500',
net_write_timeout = '500'
;
I can't attach a database dump, because my database is over 200 GB. Using an empty database is to fast to compare the times.
- [x] pgloader output you obtain
2024-02-21T17:56:04.000581+01:00 NOTICE Starting pgloader, log system is ready.
2024-02-21T17:56:04.001357+01:00 LOG pgloader version "3.6.af8c3c1"
[...]
2024-02-21T20:29:39.970798+01:00 LOG report summary reset
table name errors read imported bytes total time read write
---------------------------------- --------- --------- --------- --------- -------------- --------- ---------
fetch meta data 0 211 211 0.085s
Create Schemas 0 0 0 0.002s
Create SQL Types 0 0 0 0.001s
Create tables 0 100 100 0.238s
Set Table OIDs 0 50 50 0.001s
---------------------------------- --------- --------- --------- --------- -------------- --------- ---------
[...]
---------------------------------- --------- --------- --------- --------- -------------- --------- ---------
COPY Threads Completion 0 8 8 1h34m24.228s
Create Indexes 0 111 111 5h26m22.551s
Index Build Completion 0 111 111 38m51.722s
Reset Sequences 0 27 27 6.022s
Primary Keys 0 49 49 0.168s
Create Foreign Keys 0 50 50 20m10.162s
Create Triggers 0 0 0 0.001s
Install Comments 0 0 0 0.000s
after load 0 23 23 0.512s
---------------------------------- --------- --------- --------- --------- -------------- --------- ---------
Total import time ✓ 154281165 154281165 15.2 GB 7h59m55.363s
I removed information about the tables because they are private.
-
[x] data that is being loaded, if relevant yes, all data has been loaded.
-
[x] How the data is different from what you expected, if relevant Does not apply.
[user@host programs]$ /path/to/pgloader/bin/pgloader --client-min-messages=notice database.load
2024-02-22T10:43:40.001437+01:00 NOTICE Starting pgloader, log system is ready.
2024-02-22T10:43:40.007689+01:00 LOG pgloader version "3.6.af8c3c1"
[...]
2024-02-23T02:34:37.652045+01:00 LOG report summary reset
table name errors read imported bytes total time read write
---------------------------------- --------- --------- --------- --------- -------------- --------- ---------
fetch meta data 0 211 211 0.494s
Create Schemas 0 0 0 0.001s
Create SQL Types 0 0 0 0.000s
Create tables 0 100 100 0.360s
Set Table OIDs 0 50 50 0.002s
---------------------------------- --------- --------- --------- --------- -------------- --------- ---------
[...]
---------------------------------- --------- --------- --------- --------- -------------- --------- ---------
COPY Threads Completion 0 8 8 14h37m57.750s
Create Indexes 0 111 111 4h47m9.918s
Index Build Completion 0 111 111 48m26.715s
Reset Sequences 0 27 27 0.870s
Primary Keys 0 49 49 1.044s
Create Foreign Keys 0 50 50 24m11.139s
Create Triggers 0 0 0 0.000s
Install Comments 0 0 0 0.000s
after load 0 23 23 0.364s
---------------------------------- --------- --------- --------- --------- -------------- --------- ---------
Total import time ✓ 4449186294 4449186294 227.9 GB 20h37m47.805s
Output from the virtual server at my customer.
Duration based on the log lines = ~ 15.5 hours Duration reported by pgLoader = 20.5 hours
We are having the same issues. On our end it seems that the "Create Indexes" part makes no sense. As an example, this is a summary which only migrates the schema. Real time it took was 14 sec:
table name errors rows bytes total time
---------------------- --------- --------- --------- --------------
fetch meta data 0 3405 0.293s
Create Schemas 0 0 0.437s
Create SQL Types 0 0 0.004s
Create tables 0 1062 3.530s
Set Table OIDs 0 531 0.047s
---------------------- --------- --------- --------- --------------
---------------------- --------- --------- --------- --------------
Index Build Completion 0 1906 8.285s
Create Indexes 0 1906 1m38.173s
Reset Sequences 0 310 0.144s
Primary Keys 0 521 0.282s
Create Foreign Keys 0 968 0.785s
Create Triggers 0 0 0.000s
Install Comments 0 664 0.274s
after load 0 2 0.013s
---------------------- --------- --------- --------- --------------
Total import time ✓ 0 1m47.957s