ReplicaDB
ReplicaDB copied to clipboard
postgres sink - extra data after last expected column
Describe the bug Running replication between MySQL -> Postgres
./bin/replicadb --options-file replicadb.conf
2023-07-03 08:49:46,554 INFO ReplicaDB:63 Running ReplicaDB version: 0.15.0
2023-07-03 08:49:46,559 INFO ReplicaDB:66 Setting verbose mode DEBUG
2023-07-03 08:49:46,559 DEBUG ReplicaDB:67 ToolOptions{
sourceConnect='jdbc:mysql://myserver:3306/bugs?useSSL=true&trustServerCertificate=true',
sourceUser='whatever',
sourcePassword='****',
sourceTable='bugs.attach_data',
sourceColumns='null',
sourceWhere='null',
sourceQuery='null',
sinkConnect='jdbc:postgresql://myserver:5432/bugs',
sinkUser='whatever',
sinkPassword='****',
sinkTable='attach_data',
sinkStagingTable='null',
sinkStagingSchema='null',
sinkStagingTableAlias='null',
sinkColumns='null',
sinkDisableEscape=false,
sinkDisableIndex=false,
sinkDisableTruncate=false,
sinkAnalyze=false,
jobs=5,
BandwidthThrottling=0,
quotedIdentifiers=false,
fetchSize=500,
help=false,
version=false,
verbose=DEBUG,
optionsFile='replicadb.conf',
mode='complete',
sentryDsn='https://[email protected]/5962725',
sourceConnectionParams={},
sinkConnectionParams={},
sourceFileFormat='null',
sinkFileformat='null'}
2023-07-03 08:49:46,653 INFO Sentry:27 Sentry enabled
2023-07-03 08:49:47,608 DEBUG MySQLManager:348 Calculating the chunks size with this sql: SELECT CEIL(count(*) / 5) chunk_size, count(*) total_rows FROM bugs.attach_data
2023-07-03 08:49:47,673 DEBUG MySQLManager:353 chunkSize: 30311 totalNumberRows: 151555
2023-07-03 08:49:47,687 INFO SqlManager:373 Truncating sink table with this command: TRUNCATE TABLE attach_data
2023-07-03 08:49:48,400 INFO ReplicaTask:35 Starting TaskId-4
2023-07-03 08:49:48,400 INFO ReplicaTask:35 Starting TaskId-1
2023-07-03 08:49:48,400 INFO ReplicaTask:35 Starting TaskId-3
2023-07-03 08:49:48,400 INFO ReplicaTask:35 Starting TaskId-2
2023-07-03 08:49:48,400 INFO ReplicaTask:35 Starting TaskId-0
2023-07-03 08:49:48,943 INFO SqlManager:128 TaskId-2: Executing SQL statement: SELECT * FROM bugs.attach_data LIMIT ? OFFSET ?
2023-07-03 08:49:48,950 INFO SqlManager:128 TaskId-0: Executing SQL statement: SELECT * FROM bugs.attach_data LIMIT ? OFFSET ?
2023-07-03 08:49:48,950 INFO SqlManager:128 TaskId-1: Executing SQL statement: SELECT * FROM bugs.attach_data LIMIT ? OFFSET ?
2023-07-03 08:49:48,950 INFO SqlManager:128 TaskId-4: Executing SQL statement: SELECT * FROM bugs.attach_data LIMIT ? OFFSET ?
2023-07-03 08:49:48,957 INFO SqlManager:128 TaskId-3: Executing SQL statement: SELECT * FROM bugs.attach_data LIMIT ? OFFSET ?
2023-07-03 08:49:48,968 DEBUG SqlManager:133 TaskId-2: Using fetchSize for next query: 500
2023-07-03 08:49:48,968 DEBUG SqlManager:133 TaskId-3: Using fetchSize for next query: 500
2023-07-03 08:49:48,968 DEBUG SqlManager:133 TaskId-1: Using fetchSize for next query: 500
2023-07-03 08:49:48,968 DEBUG SqlManager:133 TaskId-4: Using fetchSize for next query: 500
2023-07-03 08:49:48,968 DEBUG SqlManager:133 TaskId-0: Using fetchSize for next query: 500
2023-07-03 08:49:48,971 INFO SqlManager:148 TaskId-0: With args: 30311, 0,
2023-07-03 08:49:48,971 INFO SqlManager:148 TaskId-1: With args: 30311, 30311,
2023-07-03 08:49:48,971 INFO SqlManager:148 TaskId-3: With args: 30311, 90933,
2023-07-03 08:49:48,971 INFO SqlManager:148 TaskId-4: With args: 30311, 121244,
2023-07-03 08:49:48,971 INFO SqlManager:148 TaskId-2: With args: 30311, 60622,
2023-07-03 08:49:50,925 WARN ConnManager:188 Options source-columns and sink-columns are null, getting from Source ResultSetMetaData: id,thedata
2023-07-03 08:49:50,932 INFO PostgresqlManager:168 Copying data with this command: COPY attach_data (id,thedata) FROM STDIN WITH DELIMITER e'\x1f' ENCODING 'UTF-8'
2023-07-03 08:49:56,373 INFO PostgresqlManager:168 Copying data with this command: COPY attach_data (id,thedata) FROM STDIN WITH DELIMITER e'\x1f' ENCODING 'UTF-8'
2023-07-03 08:50:40,643 INFO PostgresqlManager:168 Copying data with this command: COPY attach_data (id,thedata) FROM STDIN WITH DELIMITER e'\x1f' ENCODING 'UTF-8'
2023-07-03 08:51:43,430 INFO PostgresqlManager:168 Copying data with this command: COPY attach_data (id,thedata) FROM STDIN WITH DELIMITER e'\x1f' ENCODING 'UTF-8'
2023-07-03 08:52:32,807 INFO PostgresqlManager:168 Copying data with this command: COPY attach_data (id,thedata) FROM STDIN WITH DELIMITER e'\x1f' ENCODING 'UTF-8'
2023-07-03 08:56:06,350 ERROR ReplicaTask:69 ERROR in TaskId-0 inserting data to sink table: [ERROR: invalid input syntax for type bytea
Where: COPY attach_data, line 2, column thedata: "RCS file: /repository/ghost/src/client_side.c,v
retrieving revision 1.44.2.4
diff -u -r1.44.2.4 clie..."]
2023-07-03 09:00:08,807 ERROR ReplicaTask:69 ERROR in TaskId-1 inserting data to sink table: [ERROR: extra data after last expected column
Where: COPY attach_data, line 2: "73111��ࡱ�\N\N\N\N\N\N\N\N\N\N\N\N\N\N\N\N>\N\N�� \N\N\N\N\N\N\N\N\N\N\N\N\N\N\N�..."]
2023-07-03 09:01:22,591 ERROR ReplicaTask:69 ERROR in TaskId-2 inserting data to sink table: [ERROR: extra data after last expected column
Where: COPY attach_data, line 4: "348304�PNG\r\n\n\N\N\N\rIHDR\N\NL\N\N\N\N\N�K\nX\N\N\NJtEXtSoftware\NRRDtool, Tobias Oe..."]
2023-07-03 09:02:54,320 ERROR ReplicaTask:69 ERROR in TaskId-4 inserting data to sink table: [ERROR: extra data after last expected column
Where: COPY attach_data, line 2: "907571PK\N\\N\N\N!\Nv2�\N\NZ
\N\N\[Content_Types].xml �(�\N\N\N\N\N\N\N\N\N\N..."]
2023-07-03 09:07:23,519 ERROR ReplicaTask:69 ERROR in TaskId-3 inserting data to sink table: [ERROR: extra data after last expected column
Where: COPY attach_data, line 3: "621129����\NJFIF\N\N`\N`\N\N��\NC\N
\\n\..."]
2023-07-03 09:07:23,523 ERROR ReplicaDB:134 Got exception running ReplicaDB:
java.util.concurrent.ExecutionException: org.postgresql.util.PSQLException: ERROR: invalid input syntax for type bytea
Where: COPY attach_data, line 2, column thedata: "RCS file: /repository/ghost/src/client_side.c,v
retrieving revision 1.44.2.4
diff -u -r1.44.2.4 clie..."
at java.util.concurrent.FutureTask.report(FutureTask.java:122) ~[?:?]
at java.util.concurrent.FutureTask.get(FutureTask.java:191) ~[?:?]
at org.replicadb.ReplicaDB.processReplica(ReplicaDB.java:115) [ReplicaDB-0.15.0.jar:0.15.0]
at org.replicadb.ReplicaDB.main(ReplicaDB.java:47) [ReplicaDB-0.15.0.jar:0.15.0]
Caused by: org.postgresql.util.PSQLException: ERROR: invalid input syntax for type bytea
Where: COPY attach_data, line 2, column thedata: "RCS file: /repository/ghost/src/client_side.c,v
retrieving revision 1.44.2.4
diff -u -r1.44.2.4 clie..."
at org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2675) ~[postgresql-42.3.7.jar:42.3.7]
at org.postgresql.core.v3.QueryExecutorImpl.processCopyResults(QueryExecutorImpl.java:1263) ~[postgresql-42.3.7.jar:42.3.7]
at org.postgresql.core.v3.QueryExecutorImpl.endCopy(QueryExecutorImpl.java:1068) ~[postgresql-42.3.7.jar:42.3.7]
at org.postgresql.core.v3.CopyInImpl.endCopy(CopyInImpl.java:49) ~[postgresql-42.3.7.jar:42.3.7]
at org.replicadb.manager.PostgresqlManager.insertDataToTable(PostgresqlManager.java:134) ~[ReplicaDB-0.15.0.jar:0.15.0]
at org.replicadb.ReplicaTask.call(ReplicaTask.java:65) ~[ReplicaDB-0.15.0.jar:0.15.0]
at org.replicadb.ReplicaTask.call(ReplicaTask.java:15) ~[ReplicaDB-0.15.0.jar:0.15.0]
at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?]
at java.lang.Thread.run(Thread.java:829) ~[?:?]
2023-07-03 09:07:23,631 INFO ReplicaDB:54 Total process time: 1057100ms
To Reproduce Steps to reproduce the behaviour: config above
Source table DDL
id
= mediumint(9) PK
thedata
= longblob
Sink table DDL
in
= integer
thedata
= bytea
ReplicaDB configuration options-file. Expected behavior A clear and concise description of what you expected to happen.
Additional context
Hi @ScottChapman
To open a new issue, please, follow the issue template:
Describe the bug A clear and concise description of what the bug is, with verbose log.
To Reproduce Steps to reproduce the behaviour:
- Source table DDL
- Sink table DDL
- ReplicaDB configuration
options-file
.
Expected behavior A clear and concise description of what you expected to happen.
Additional context Add any other context about the problem here. Running environment (cloud, on premise, java version..), source and sink technologies (Oracle, MySQL, Postgres...)
Hi @ScottChapman
To open a new issue, please, follow the issue template:
Describe the bug A clear and concise description of what the bug is, with verbose log.
To Reproduce Steps to reproduce the behaviour:
- Source table DDL
- Sink table DDL
- ReplicaDB configuration
options-file
.Expected behavior A clear and concise description of what you expected to happen.
Additional context Add any other context about the problem here. Running environment (cloud, on premise, java version..), source and sink technologies (Oracle, MySQL, Postgres...)
Sorry, updated above. I'll close the other one since both errors are in same run. Thanks!
Hi @ScottChapman
I have made a fix that may solve your issue. Can you try the latest release? https://github.com/osalvador/ReplicaDB/releases/tag/v0.15.1
Regards!
Yes, better. I do get the error but much less frequently. Is it possible to dump any information that might be helpful in identifying the data that is causing the problem for me?
@osalvador - are the copy commands kept some place? I assume there is a tmp file for the SQL commands that are executed? Is it possible to preserve that for debugging purposes?
OK. I think I narrowed it down to the data that is problematic. There is a bunch of unicode data in it. I can probably clean it up a bit so I can share it if you are interested.
Hi @ScottChapman
How was it?