Subsetting SQL script generation fails with '"Connection reset" in statement "Insert into JAILER_ENTITY...' after having been running for 10s of hours

Open shabbir-genetech opened this issue 3 years ago • 1 comments

A subsetting SQL script generation that I'm trying to run is taking a long time and then failing with messages like:

[jailer-main] ERROR - Job-error
net.sf.jailer.database.SqlException: "Connection reset" in statement "Insert into JAILER_ENTITY (r_entitygraph, PK0, PK4, birthday, type) Select distinct TOP(30000) 28871 as graph_id, ...

I have run this operation twice at this point. The first run failed with this issue after running for 10+ hours and having 'collected' 35 million+ rows. The second run failed with this issue after running for 30+ hours and having 'collected' 50 million+ rows. I restarted Jailer between the two runs.

The database I am trying to subset is a large production one and it is remote to the system running Jailer. I am not sure if the long running times are a cause/symptom of the underlying issue or are just due to the size of the DB and Jailer running remotely.

Both failures occurred on different tables and on different 'generations/days'.

The issue is two-fold:

  1. Is it safe to assume that the issue was connectivity related? Are there any known remedies I could try? Does Jailer retry this operation if it fails? Is there a way to configure the number of retries and the waiting period between them?
  2. Once the error occurred, the process basically stopped and there was no way to resume and no way to retrieve the already 'collected' rows. Perhaps there should be options for both of those operations.
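
For illustration, the retry behaviour asked about in point 1 (a bounded number of attempts with a waiting period in between) could look like the following sketch in a plain Java client. `RetrySketch`, `withRetry`, `maxAttempts`, and `initialDelayMillis` are hypothetical names chosen for this example; nothing here is part of Jailer, and it makes no claim about how Jailer handles failed statements.

```java
import java.util.concurrent.Callable;

/**
 * Hypothetical helper, not part of Jailer: runs an operation up to
 * maxAttempts times and doubles the waiting period after each failure.
 */
public final class RetrySketch {

    private RetrySketch() {}

    public static <T> T withRetry(Callable<T> operation, int maxAttempts, long initialDelayMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long delay = initialDelayMillis;
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return operation.call();
            } catch (Exception e) {
                last = e;
                // Wait only if another attempt will follow.
                if (attempt < maxAttempts) {
                    Thread.sleep(delay);
                    delay *= 2; // exponential backoff
                }
            }
        }
        // All attempts failed: rethrow the most recent error.
        throw last;
    }
}
```

Since a "Connection reset" generally leaves the existing connection unusable, the operation passed to such a helper would have to open a fresh connection on each attempt rather than reuse the old one.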

Any advice/insight on how to get the subsetting completed will be highly appreciated.

And thank you for creating this amazing piece of software.

Debug Information

Jailer version: 12.5.3.4
Java: Java HotSpot(TM) 64-Bit Server VM 25.341-b10 (1.8.0_341)
System: Windows 10 Version 10.0 running on amd64; Cp1252; US
Memory: 1,053,331,064 free (94%), 1,118,830,592 max (From the 'about' screen after the second failure)
Database system: Azure SQL Server

shabbir-genetech • Aug 07 '22 16:08

Hi,

The statement that failed here was executed on the remote database (MS SQL). The DBMS probably timed out because the statement took too long.

You can try to avoid such errors by selecting "local database" as the "working table scope" (in the "Data Export" dialog). The total runtime will increase, but long-running "Insert Into..." statements will be avoided.

Basically, Jailer is a database client that makes requests to a DBMS. If the DBMS aborts a query with an error, the tool's work cannot continue either. The usual rules for query runtimes also apply: indexes may be necessary to bring performance to a reasonable level (and thus avoid timeouts). With Jailer, the missing indexes are usually those on foreign key columns, because, depending on the extraction model, associations are also resolved in the parent -> child direction.
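
As a rough sketch of how candidate indexes on foreign key columns could be listed, the following Java class reads a table's foreign keys through standard JDBC metadata (`DatabaseMetaData.getImportedKeys`) and builds one `CREATE INDEX` statement per key. The class and method names (`FkIndexSketch`, `createIndexSql`, `suggestIndexes`) and the `ix_...` naming scheme are assumptions made for this example and are not part of Jailer. The sketch does not check whether a suitable index already exists, and the generated statements should be reviewed before being run against a production database.

```java
import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper, not part of Jailer. */
public final class FkIndexSketch {

    private FkIndexSketch() {}

    /** Builds a CREATE INDEX statement covering the given foreign key columns. */
    public static String createIndexSql(String table, List<String> fkColumns) {
        String name = "ix_" + table + "_" + String.join("_", fkColumns);
        return "CREATE INDEX " + name + " ON " + table
                + " (" + String.join(", ", fkColumns) + ")";
    }

    /** Returns one index statement per foreign key declared on the table. */
    public static List<String> suggestIndexes(Connection conn, String schema, String table)
            throws SQLException {
        // Group columns by foreign key so composite keys yield a single index.
        Map<String, List<String>> columnsByKey = new LinkedHashMap<>();
        DatabaseMetaData meta = conn.getMetaData();
        try (ResultSet rs = meta.getImportedKeys(conn.getCatalog(), schema, table)) {
            while (rs.next()) {
                String fkName = rs.getString("FK_NAME");
                // FK_NAME may be null; fall back to the referenced table name.
                String key = fkName != null ? fkName : rs.getString("PKTABLE_NAME");
                columnsByKey.computeIfAbsent(key, k -> new ArrayList<>())
                        .add(rs.getString("FKCOLUMN_NAME"));
            }
        }
        List<String> statements = new ArrayList<>();
        for (List<String> columns : columnsByKey.values()) {
            statements.add(createIndexSql(table, columns));
        }
        return statements;
    }
}
```

For a hypothetical child table `ORDER_ITEM` with a foreign key column `ORDER_ID`, `createIndexSql` produces `CREATE INDEX ix_ORDER_ITEM_ORDER_ID ON ORDER_ITEM (ORDER_ID)`.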

Is it intended to extract tens of millions of rows? As you can see, the rows to be exported (or rather their PKs) are first collected in a table with "Insert Into" statements. Of course, the DBMS must be able to handle this, so that it does not abort the statements for lack of resources.

Wisser • Aug 08 '22 08:08