[ora2pg - 24.3] Oracle source DB TEST_DATA with PARALLEL_TABLES > 1: ORA-03113: end-of-file on communication channel
Hello,
Using ora2pg 24.3 and trying verify data migrated through the export type TEST_DATA with an Oracle source database, using PARALLEL_TABLES > 1 result in ORA-03113: end-of-file on communication channel after first batch of parallel processing, followed by ORA-03135: connection lost contact on each next table iteration.
Here is an extract of the output using PARALLEL_TABLES = 2:
Result will be written to file /report-migration-status-data_logs/data_validation.log
[2025-04-14 08:45:21] DEBUG: cloning connection to Oracle.
[2025-04-14 08:45:21] DEBUG: cloning connection to Oracle.
[2025-04-14 08:45:22] DEGUG: Query sent to Oracle: SELECT a."A_COLUMN_1",a."A_COLUMN_2",... FROM "MY_SCHEMA"."A_TABLE" a
[2025-04-14 08:45:22] Get rows from oracle tables: SELECT * FROM ( SELECT a."A_COLUMN_1",a."A_COLUMN_1" FROM "MY_SCHEMA"."A_TABLE" a ORDER BY 1,2,3,... ) WHERE ROWNUM <= 10000
[2025-04-14 08:45:22] Get rows from migrated tables: SELECT a_column_1,a_column_2,... FROM my_schema.a_table ORDER BY 1,2,... LIMIT 10000
[2025-04-14 08:45:22] Checking data validation for table
[2025-04-14 08:45:22] DEBUG: cloning connection to Oracle.
[2025-04-14 08:45:22] DEBUG: cloning connection to Oracle.
[2025-04-14 08:45:23] DEGUG: Query sent to Oracle: SELECT a."A_COLUMN_1",a."A_COLUMN_2",... FROM "MY_SCHEMA"."ANOTHER_TABLE" a
[2025-04-14 08:45:23] Get rows from oracle tables: SELECT * FROM ( SELECT a."A_COLUMN_1",a."A_COLUMN_2",... FROM "MY_SCHEMA"."ANOTHER_TABLE" a ORDER BY 1 ) WHERE ROWNUM <= 10000
[2025-04-14 08:45:23] DEGUG: Query sent to Oracle: SELECT a."A_COLUMN_1",a."A_COLUMN_2",... FROM "MY_SCHEMA"."ANOTHER_TABLE_2" a
[2025-04-14 08:45:23] Get rows from oracle tables: SELECT * FROM ( SELECT a."A_COLUMN_1",a."A_COLUMN_2",... FROM "MY_SCHEMA"."ANOTHER_TABLE_2" a ORDER BY 1,2,... ) WHERE ROWNUM <= 10000
[2025-04-14 08:45:23] Get rows from migrated tables: SELECT a_column_1,a_column_2... FROM my_schema.another_table ORDER BY 1 LIMIT 10000
[2025-04-14 08:45:23] Checking data validation for table
[2025-04-14 08:45:23] Get rows from migrated tables: SELECT a_column_1,a_column_2,... FROM my_schema.another_table_2 ORDER BY 1,2,... LIMIT 10000
[2025-04-14 08:45:23] Checking data validation for table
[2025-04-14 08:45:23] DEBUG: cloning connection to Oracle.
[2025-04-14 08:45:23] FATAL: _column_attributes() ORA-03113: end-of-file on communication channel
Process ID: 58917276
Session ID: 45 Serial number: 5949 (DBD ERROR: OCIStmtExecute/Describe)
Aborting export...
Issuing rollback() due to DESTROY without explicit disconnect() of DBD::Oracle::db handle (DESCRIPTION=(ADDRESS=(HOST=my-oracle.host.com)(PORT=1521)(PROTOCOL=tcp))(CONNECT_DATA=(SERVICE_NAME=IORA2))) at /usr/local/lib/perl5/site_perl/5.38.2/Ora2Pg.pm line 15529.
[2025-04-14 08:45:24] DEBUG: cloning connection to Oracle.
[2025-04-14 08:45:24] FATAL: _column_attributes() ORA-03135: connection lost contact
Process ID: 58917276
Session ID: 45 Serial number: 5949 (DBD ERROR: OCIStmtExecute/Describe)
Aborting export...
Issuing rollback() due to DESTROY without explicit disconnect() of DBD::Oracle::db handle (DESCRIPTION=(ADDRESS=(HOST=my-oracle.host.com)(PORT=1521)(PROTOCOL=tcp))(CONNECT_DATA=(SERVICE_NAME=IORA2))) at /usr/local/lib/perl5/site_perl/5.38.2/Ora2Pg.pm line 15529.
[2025-04-14 08:45:24] DEBUG: cloning connection to Oracle.
[2025-04-14 08:45:25] FATAL: _column_attributes() ORA-03135: connection lost contact
Process ID: 58917276
Session ID: 45 Serial number: 5949 (DBD ERROR: OCIStmtExecute/Describe)
Aborting export...
Issuing rollback() due to DESTROY without explicit disconnect() of DBD::Oracle::db handle (DESCRIPTION=(ADDRESS=(HOST=my-oracle.host.com)(PORT=1521)(PROTOCOL=tcp))(CONNECT_DATA=(SERVICE_NAME=IORA2))) at /usr/local/lib/perl5/site_perl/5.38.2/Ora2Pg.pm line 15529.
...
note: the Ora2Pg.pm line mentioned is specific to my context as I have a patched 24.3 version of Ora2Pg.
Analysis:
Ora2Pg.pm_data_validationfunction is spawningcompare_datafunction iterativelyOra2Pg.pmcompare_datafunction is calling_column_attributesOra2pg.pm_column_attributesfunction is callingOra2Pg/Oracle.pm_column_attributesOra2Pg/Oracle.pm_column_attributesfunction is callingprepareon the Ora2Pg Oracle DB handler
That handler may have been closed by a previous parallel processing (I haven't identified where it occurs in the processing of a table).
But, for sure, the first error _column_attributes() ORA-03113: end-of-file on communication channel kicks in and other iterations throws the ORA-03135: connection lost contact errors in this context.
As _column_attributes function must be called by table, I suggest moving its calls back to the start of the iteration, appending its results to the colinfo hash.
This way, regardless of whether we are in PARALLEL_TABLES (spawned or not), colinfo will be available to the compare_data function for the table being processed.
Here is a proposed patch:
--- Ora2Pg.pm
+++ Ora2Pg.pm
@@ -1,10 +1,17 @@
my $q = 1;
foreach my $f (@foreign_tables)
{
next if (!exists $list_tables{"\L$f\E"});
+ if (!$self->{fdw_server}) {
+ # Extract all column information used to determine data export.
+ # This hash will be used in function _howto_get_data()
+ # Appending the column information to the hash
+ %{$self->{colinfo}} = (%{$self->{colinfo}}, $self->_column_attributes($f, $self->{schema}, 'TABLE'));
+ }
+
if ($self->{parallel_tables} > 1)
{
while ($self->{child_count} >= $self->{parallel_tables})
{
my $kid = waitpid(-1, WNOHANG);
@@ -1,13 +1,10 @@
$sth = $dbhora->prepare($sql) or $self->logit("ERROR: " . $dbhora->errstr . "\n", 0, 0);
$sth->execute or $self->logit("FATAL: " . $dbhora->errstr . "\n", 0, 0);
}
else
{
- # Extract all column information used to determine data export.
- # This hash will be used in function _howto_get_data()
- %{$self->{colinfo}} = $self->_column_attributes($tb, $self->{schema}, 'TABLE');
$sql = $self->_get_oracle_test_data($tb);
# Oracle data lookup
if ($self->{data_validation_rows})
{
if ($self->{db_version} =~ /Release (8|9|10|11)/)
Commit 881aa8c might solve this issue. Actually I think that the problem was that main ora2pg process was ending without waiting for last forked processes end.
@tanguydelignieresaccenture How does your test database looks like? I'm having the same error, with 25.0 as well, but I get the error after 2h30m in COPY type. No errors on TEST_DATA type