Migrating data from SQL Server to Postgres fails on big tables; the process gets stuck
Thanks for contributing to pgloader by reporting an issue! Reporting an issue is the only way we can solve problems, fix bugs, and improve both the software and its user experience in general.
The best bug reports follow these 3 simple steps:
- show what you did: I ran pgloader to migrate a database from SQL Server to Postgres. We have issues with the tables that have more than 2 million records; I cannot migrate the big tables, the process gets stuck.
- show the result you got:

```
[postgres@rbalvdpost03 Archive]$ pgloader --verbose auditRecordsMaterialized.load
2024-11-08T15:32:21.002000Z NOTICE Starting pgloader, log system is ready.
2024-11-08T15:32:21.010000Z LOG pgloader version "3.6.7~devel"
2024-11-08T15:32:21.157001Z LOG Migrating from #<MSSQL-CONNECTION mssql://[email protected]:1433/Core.Pack232Sandpit {10067E6D43}>
2024-11-08T15:32:21.157001Z LOG Migrating into #<PGSQL-CONNECTION pgsql://[email protected]:5432/Core.Pack232BigTable {10067E7043}>
Max connections reached, increase value of TDS_MAX_CONN
2024-11-08T15:32:21.538003Z NOTICE Prepare PostgreSQL database.
2024-11-08T15:32:21.678004Z NOTICE COPY audittrail.auditrecordsmaterialized with 0 rows estimated [3/4]
Max connections reached, increase value of TDS_MAX_CONN
```
The console shows this and does not advance any further. On the SQL Server side I can see the SELECT for the table in a suspended state with an async_network_io wait, and on the Postgres server I can see the COPY command, but it is in idle status.
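For reference, here is a minimal sketch of the kind of monitoring queries that surface those two states. The exact queries are not part of the original report, so the column choices and the session filter are assumptions:

```
-- On the SQL Server source: list active requests and their waits.
-- The pgloader SELECT typically shows status = 'suspended' with
-- wait_type = 'ASYNC_NETWORK_IO' while the client consumes rows.
SELECT session_id, status, command, wait_type, wait_time
FROM sys.dm_exec_requests
WHERE session_id > 50;  -- skip system sessions

-- On the Postgres target: show the COPY backend and its state.
-- 'idle' or 'idle in transaction' means the server is waiting on the client.
SELECT pid, state, wait_event_type, wait_event, query
FROM pg_stat_activity
WHERE query ILIKE 'COPY%';
```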
My load file is this one:
```
load database
     from mssql://sqlpostgres_user:[email protected]/Core.Pack232Sandpit
     into postgresql://postgres2:[email protected]/Core.Pack232BigTable

WITH include drop, truncate, create tables, create indexes,
     reset sequences, foreign keys

including only table names like 'AuditRecordsMaterialized' in schema 'AuditTrail'

SET work_mem to '256MB', maintenance_work_mem to '1024 MB';
```
- explain how the result is not what you expected.
The table is not migrated to the Postgres server. It has 6 million records; other tables of the same database with fewer records have been migrated fine.
Any idea how I can approach this?
Best Regards
In the case of pgloader, here's the information I will need to read in your bug report. Having all of this is a big help, and often means the bug you reported can be fixed very efficiently as soon as I get to it.
Please provide the following information:
```
pgloader version "3.6.7~devel"
compiled with SBCL 2.2.10-1.rhel9
```
- [ ] did you test a fresh compile from the source tree? No
Compiling pgloader from sources is documented in the README, it's easy to do, and if patches are to be made to fix your bug, you're going to have to build from sources to get the fix anyway…
- [ ] did you search for other similar issues? No
- [ ] how can I reproduce the bug? No
Include a self-contained pgloader command file.
If you're loading from a database, consider attaching a database dump to your issue. For MySQL, use mysqldump. For SQLite, just send over your source file, that's easy. Maybe not the one with your production data, of course; just the one with the sample of data that allows me to reproduce your bug. When using a proprietary database system as a source, consider creating a sample database on some Cloud service or somewhere you can then give me access to, and see my email address on my GitHub profile to send me the credentials. Still open a public issue for tracking and as documentation for other users.
```
--
-- EDIT THIS FILE TO MATCH YOUR BUG REPORT
--
LOAD CSV
     FROM INLINE with encoding 'ascii'
     INTO postgresql:///pgloader
     TARGET TABLE jordane
     WITH truncate,
          fields terminated by '|',
          fields not enclosed,
          fields escaped by backslash-quote
     SET work_mem to '128MB',
         standard_conforming_strings to 'on'
     BEFORE LOAD DO
      $$ drop table if exists jordane; $$,
      $$ CREATE TABLE jordane
         (
           "NOM"    character(20),
           "PRENOM" character(20)
         )
      $$;

BORDET|Jordane
BORDET|Audrey
LASTNAME|"opening quote
BONNIER|testprenombe~aucouptroplong
JOURDAIN|héhé¶
```
- [ ] pgloader output you obtain
PASTE HERE THE OUTPUT OF THE PGLOADER COMMAND
- [ ] data that is being loaded, if relevant
PASTE HERE THE DATA THAT HAS BEEN LOADED
- [ ] How the data is different from what you expected, if relevant
I had a similar issue with the above, running the Docker image on Windows 10. I tried many things, including giving more resources to WSL2 and disabling swap, but the only way I got it not to fail with heap exhaustion on big tables was to include the following parameter in the WITH clause:

prefetch rows = 10000

This can be adjusted according to your data volume.
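For context, a minimal sketch of where that option goes in a command file; the connection strings are placeholders, not from the original post:

```
LOAD DATABASE
     FROM mssql://user:pass@mssql-host/source_db
     INTO postgresql://user:pass@pg-host/target_db

-- "prefetch rows" bounds how many rows pgloader reads ahead of the
-- PostgreSQL COPY, which in turn bounds memory use on very large tables.
WITH include drop, create tables, create indexes,
     reset sequences,
     prefetch rows = 10000;
```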
Hi Cio, I really appreciate your post. I did as you said, increasing it to 50000, but it does not work. I don't know what prefetch rows value I would have to set for 3,000,000 records.
At the beginning, when I run pgloader and check it in the console with the ps command, I can see it is working: the status is COPY. After some minutes, the status appears as idle in transaction (aborted).
Also, I see my CPU usage is very high. I don't know if I need more resources.
Thank you in advance.
This is my CPU:

```
processor  : 2
vendor_id  : GenuineIntel
cpu family : 6
model      : 79
model name : Intel(R) Xeon(R) Gold 6240 CPU @ 2.60GHz
```
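That output looks like it comes from /proc/cpuinfo (an assumption on my part); if so, a quick way to gather the same fields on Linux is:

```
# Print the CPU identification fields for each logical processor
grep -E 'processor|vendor_id|cpu family|model' /proc/cpuinfo
```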
I am not an expert on this, but I will suggest some things to help your troubleshooting.
- Perhaps try with a lower value of prefetch rows? Below are the load options that I used to make it work with my setup (I am loading directly from SQL Server; you can ignore the CAST and BEFORE/AFTER LOAD DO sections I have):
```
LOAD DATABASE
     <omitted connection strings>

WITH include drop,
     create tables,
     create indexes,
     reset sequences,
     prefetch rows = 1000

SET work_mem to '512MB',
    maintenance_work_mem to '1024MB'

CAST type datetime to timestamptz using zero-dates-to-null,
     type int when (= precision 10) to bigint,
     type nvarchar to text drop typemod using remove-null-characters

BEFORE LOAD DO
     $$ CREATE EXTENSION IF NOT EXISTS "uuid-ossp"; $$,
     $$ CREATE EXTENSION IF NOT EXISTS "system_stats"; $$

AFTER LOAD DO
     $$ GRANT ALL PRIVILEGES ON ALL TABLES IN SCHEMA public TO postgres; $$;
```
With the above, I was able to load a database that had a table with 23,653,246 rows (consuming about 4 GB of space).
- As I am assuming you are running this with Docker on Windows: I found that sometimes the console log was not reliable or helpful enough for troubleshooting, or even for monitoring progress. While the process is running, check pgloader's working folder on the Docker container (the pgloader.log file there will also help); it may give you some further hints on where the process is hanging. See the sketch after this bullet.
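A minimal sketch of that check, assuming the container is named pgloader and pgloader's default --root-dir of /tmp/pgloader (neither name comes from the original post; adjust both to your setup):

```
# List pgloader's working directory inside the running container
docker exec pgloader ls -l /tmp/pgloader

# Follow the log file while the migration is running
docker exec pgloader tail -f /tmp/pgloader/pgloader.log
```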
I hope this helps.
Hi Cio, I'm really grateful for your help. I will test this next Monday, fingers crossed. At the moment I don't have access to the logs, as I don't have root access; I will check them as soon as I do. Thanks again.