exist icon indicating copy to clipboard operation
exist copied to clipboard

Possible issue - 1st incremental following full backup takes 8x longer than the full backup

Open bkcsfi opened this issue 1 year ago • 10 comments

We are experiencing unusually slow backup performance.

  1. full backup - about 8 minutes
  2. incremental backup - about 25 minutes
  3. first incremental that follows a full backup - 65 minutes

Using 'official' eXist docker image:

REPOSITORY                                         TAG           IMAGE ID
existdb/existdb                                    6.2.0         772227099fd0

You are running
eXist 6.2.0 c8fa4958b6d4a50bd0cba7f3e76a150226414187 2023-02-04T22:42:29Z

conf.xml:

<job type="system" name="check1"
     class="org.exist.storage.ConsistencyCheckTask"
     cron-trigger="0 0 19 * * ?">
     <parameter name="output" value="/backup/exist2303"/>
     <parameter name="backup" value="yes"/>
     <parameter name="incremental" value="yes"/>
     <parameter name="incremental-check" value="no"/>
 </job>
 <job type="system" name="check2"
     class="org.exist.storage.ConsistencyCheckTask"
     cron-trigger="0 0 1 * * ?">
     <parameter name="output" value="/backup/exist2303"/>
     <parameter name="backup" value="yes"/>
     <parameter name="incremental" value="no"/>
     <parameter name="incremental-check" value="no"/>
 </job>

I've pasted portions of the exist.log below.

Is this expected behavior? If so, we can switch to run full backups all the time.

Thanks

~$ grep backup /opt/docker_volumes/exist2303/logs/exist.log 
2023-05-25 19:00:00,002 [db.exist.scheduler.quartz-worker-3] INFO  (ConsistencyCheckTask.java [execute]:179) - Starting backup... 
2023-05-25 19:25:47,863 [db.exist.scheduler.quartz-worker-3] INFO  (ConsistencyCheckTask.java [execute]:186) - Created backup to file: /backup/exist2303/inc20230525-1900.zip 
2023-05-25 19:25:47,863 [db.exist.scheduler.quartz-worker-3] INFO  (ConsistencyCheckTask.java [execute]:189) - Finished backup 
2023-05-26 19:00:00,002 [db.exist.scheduler.quartz-worker-4] INFO  (ConsistencyCheckTask.java [execute]:179) - Starting backup... 
2023-05-26 19:08:14,894 [db.exist.scheduler.quartz-worker-4] INFO  (ConsistencyCheckTask.java [execute]:186) - Created backup to file: /backup/exist2303/full20230526-1900.zip 
2023-05-26 19:08:14,916 [db.exist.scheduler.quartz-worker-4] INFO  (ConsistencyCheckTask.java [execute]:189) - Finished backup 
2023-05-27 19:00:00,002 [db.exist.scheduler.quartz-worker-3] INFO  (ConsistencyCheckTask.java [execute]:179) - Starting backup... 
2023-05-27 20:08:51,722 [db.exist.scheduler.quartz-worker-3] INFO  (ConsistencyCheckTask.java [execute]:186) - Created backup to file: /backup/exist2303/inc20230527-1900.zip 
2023-05-27 20:08:51,723 [db.exist.scheduler.quartz-worker-3] INFO  (ConsistencyCheckTask.java [execute]:189) - Finished backup 
2023-05-28 19:00:00,001 [db.exist.scheduler.quartz-worker-1] INFO  (ConsistencyCheckTask.java [execute]:179) - Starting backup... 
2023-05-28 19:24:45,218 [db.exist.scheduler.quartz-worker-1] INFO  (ConsistencyCheckTask.java [execute]:186) - Created backup to file: /backup/exist2303/inc20230528-1900.zip 
2023-05-28 19:24:45,218 [db.exist.scheduler.quartz-worker-1] INFO  (ConsistencyCheckTask.java [execute]:189) - Finished backup 
2023-05-29 19:00:00,002 [db.exist.scheduler.quartz-worker-2] INFO  (ConsistencyCheckTask.java [execute]:179) - Starting backup... 
2023-05-29 19:20:56,310 [db.exist.scheduler.quartz-worker-2] INFO  (ConsistencyCheckTask.java [execute]:186) - Created backup to file: /backup/exist2303/inc20230529-1900.zip 
2023-05-29 19:20:56,322 [db.exist.scheduler.quartz-worker-2] INFO  (ConsistencyCheckTask.java [execute]:189) - Finished backup 
2023-05-30 19:00:00,002 [db.exist.scheduler.quartz-worker-3] INFO  (ConsistencyCheckTask.java [execute]:179) - Starting backup... 
2023-05-30 19:23:49,698 [db.exist.scheduler.quartz-worker-3] INFO  (ConsistencyCheckTask.java [execute]:186) - Created backup to file: /backup/exist2303/inc20230530-1900.zip 
2023-05-30 19:23:49,699 [db.exist.scheduler.quartz-worker-3] INFO  (ConsistencyCheckTask.java [execute]:189) - Finished backup 
2023-05-31 19:00:00,007 [db.exist.scheduler.quartz-worker-2] INFO  (ConsistencyCheckTask.java [execute]:179) - Starting backup... 
2023-05-31 19:08:20,285 [db.exist.scheduler.quartz-worker-2] INFO  (ConsistencyCheckTask.java [execute]:186) - Created backup to file: /backup/exist2303/full20230531-1900.zip 
2023-05-31 19:08:20,293 [db.exist.scheduler.quartz-worker-2] INFO  (ConsistencyCheckTask.java [execute]:189) - Finished backup 
2023-06-01 19:00:00,007 [db.exist.scheduler.quartz-worker-1] INFO  (ConsistencyCheckTask.java [execute]:179) - Starting backup... 
2023-06-01 20:04:42,459 [db.exist.scheduler.quartz-worker-1] INFO  (ConsistencyCheckTask.java [execute]:186) - Created backup to file: /backup/exist2303/inc20230601-1900.zip 
2023-06-01 20:04:42,459 [db.exist.scheduler.quartz-worker-1] INFO  (ConsistencyCheckTask.java [execute]:189) - Finished backup 
2023-06-02 19:00:00,001 [db.exist.scheduler.quartz-worker-4] INFO  (ConsistencyCheckTask.java [execute]:179) - Starting backup... 
2023-06-02 19:27:14,564 [db.exist.scheduler.quartz-worker-4] INFO  (ConsistencyCheckTask.java [execute]:186) - Created backup to file: /backup/exist2303/inc20230602-1900.zip 
2023-06-02 19:27:14,565 [db.exist.scheduler.quartz-worker-4] INFO  (ConsistencyCheckTask.java [execute]:189) - Finished backup 
2023-06-03 19:00:00,000 [db.exist.scheduler.quartz-worker-1] INFO  (ConsistencyCheckTask.java [execute]:179) - Starting backup... 
2023-06-03 19:29:16,353 [db.exist.scheduler.quartz-worker-1] INFO  (ConsistencyCheckTask.java [execute]:186) - Created backup to file: /backup/exist2303/inc20230603-1900.zip 
2023-06-03 19:29:16,353 [db.exist.scheduler.quartz-worker-1] INFO  (ConsistencyCheckTask.java [execute]:189) - Finished backup 
2023-06-04 19:00:00,002 [db.exist.scheduler.quartz-worker-1] INFO  (ConsistencyCheckTask.java [execute]:179) - Starting backup... 
2023-06-04 19:25:00,986 [db.exist.scheduler.quartz-worker-1] INFO  (ConsistencyCheckTask.java [execute]:186) - Created backup to file: /backup/exist2303/inc20230604-1900.zip 
2023-06-04 19:25:00,986 [db.exist.scheduler.quartz-worker-1] INFO  (ConsistencyCheckTask.java [execute]:189) - Finished backup 
2023-06-05 19:00:00,001 [db.exist.scheduler.quartz-worker-2] INFO  (ConsistencyCheckTask.java [execute]:179) - Starting backup... 
2023-06-05 19:12:19,420 [db.exist.scheduler.quartz-worker-2] INFO  (ConsistencyCheckTask.java [execute]:186) - Created backup to file: /backup/exist2303/full20230605-1900.zip 
2023-06-05 19:12:19,427 [db.exist.scheduler.quartz-worker-2] INFO  (ConsistencyCheckTask.java [execute]:189) - Finished backup 
2023-06-06 19:00:00,003 [db.exist.scheduler.quartz-worker-4] INFO  (ConsistencyCheckTask.java [execute]:179) - Starting backup... 
2023-06-06 20:05:24,414 [db.exist.scheduler.quartz-worker-4] INFO  (ConsistencyCheckTask.java [execute]:186) - Created backup to file: /backup/exist2303/inc20230606-1900.zip 
2023-06-06 20:05:24,414 [db.exist.scheduler.quartz-worker-4] INFO  (ConsistencyCheckTask.java [execute]:189) - Finished backup 

bkcsfi avatar Jun 07 '23 00:06 bkcsfi

Strangely I don't see entries in the log for the 1am backup task, but I do find these entries. Does this indicate there is a problem?

2023-05-26 01:00:00,002 [db.exist.scheduler.quartz-worker-4] INFO  (ConsistencyCheckTask.java [execute]:138) - Consistency check is paused. 
2023-05-27 01:00:00,007 [db.exist.scheduler.quartz-worker-2] INFO  (ConsistencyCheckTask.java [execute]:138) - Consistency check is paused. 
2023-05-28 01:00:00,001 [db.exist.scheduler.quartz-worker-3] INFO  (ConsistencyCheckTask.java [execute]:138) - Consistency check is paused. 
2023-05-29 01:00:00,001 [db.exist.scheduler.quartz-worker-2] INFO  (ConsistencyCheckTask.java [execute]:138) - Consistency check is paused. 
2023-05-30 01:00:00,005 [db.exist.scheduler.quartz-worker-3] INFO  (ConsistencyCheckTask.java [execute]:138) - Consistency check is paused. 
2023-05-31 01:00:00,005 [db.exist.scheduler.quartz-worker-4] INFO  (ConsistencyCheckTask.java [execute]:138) - Consistency check is paused. 
2023-06-01 01:00:00,001 [db.exist.scheduler.quartz-worker-2] INFO  (ConsistencyCheckTask.java [execute]:138) - Consistency check is paused. 
2023-06-02 01:00:00,002 [db.exist.scheduler.quartz-worker-1] INFO  (ConsistencyCheckTask.java [execute]:138) - Consistency check is paused. 
2023-06-03 01:00:00,002 [db.exist.scheduler.quartz-worker-2] INFO  (ConsistencyCheckTask.java [execute]:138) - Consistency check is paused. 
2023-06-04 01:00:00,001 [db.exist.scheduler.quartz-worker-2] INFO  (ConsistencyCheckTask.java [execute]:138) - Consistency check is paused. 
2023-06-05 01:00:00,003 [db.exist.scheduler.quartz-worker-1] INFO  (ConsistencyCheckTask.java [execute]:138) - Consistency check is paused. 
2023-06-06 01:00:00,002 [db.exist.scheduler.quartz-worker-4] INFO  (ConsistencyCheckTask.java [execute]:138) - Consistency check is paused. 

bkcsfi avatar Jun 07 '23 00:06 bkcsfi

@bkcsfi where is the backup stored? inside the container or to container volume?

duncdrum avatar Jun 15 '23 15:06 duncdrum

Hi,

The backups are outside of the container, accessed through a volume mount.

I restarted eXist last night, it had been running for 3 weeks. This resolved the issue with "Consistency check is paused".

Instead, the 1am check task worked:

2023-06-15 01:00:00,026 [db.exist.scheduler.quartz-worker-3] INFO  (ConsistencyCheckTask.java [execute]:152) - Starting consistency check... 
2023-06-15 01:00:20,727 [db.exist.scheduler.quartz-worker-3] ERROR (ConsistencyCheckTask.java [execute]:165) - Errors found: 1 
2023-06-15 01:00:20,727 [db.exist.scheduler.quartz-worker-3] ERROR (ConsistencyCheckTask.java [execute]:170) - Fatal errors were found: pausing the consistency check task. 
2023-06-15 01:00:20,727 [db.exist.scheduler.quartz-worker-3] INFO  (ConsistencyCheckTask.java [execute]:175) - Finished consistency check 
2023-06-15 01:00:20,727 [db.exist.scheduler.quartz-worker-3] INFO  (ConsistencyCheckTask.java [execute]:179) - Starting backup... 
2023-06-15 01:01:54,146 [db.exist.scheduler.quartz-worker-3] ERROR (ConsistencyCheckTask.java [error]:250) - EXPORT: Skipping damaged document summary.xml 
2023-06-15 01:01:56,082 [db.exist.scheduler.quartz-worker-3] ERROR (ConsistencyCheckTask.java [error]:250) - EXPORT: Found an orphaned document: summary.xml 
2023-06-15 01:01:56,281 [db.exist.scheduler.quartz-worker-3] INFO  (ConsistencyCheckTask.java [execute]:186) - Created backup to file: /backup/exist2303/full20230615-0100.zip 
2023-06-15 01:01:56,282 [db.exist.scheduler.quartz-worker-3] INFO  (ConsistencyCheckTask.java [execute]:189) - Finished backup 

I will monitor the system for a few more days and report back if the restart also resolved the 1+ hour backup time.

bkcsfi avatar Jun 15 '23 15:06 bkcsfi

Following up on this.

I see that "consistency check is paused" problem is back

2023-06-21 01:00:00,002 [db.exist.scheduler.quartz-worker-3] INFO  (ConsistencyCheckTask.java [execute]:138) - Consistency check is paused. 
2023-06-21 19:00:00,001 [db.exist.scheduler.quartz-worker-1] INFO  (ConsistencyCheckTask.java [execute]:179) - Starting backup... 
2023-06-21 19:20:20,571 [db.exist.scheduler.quartz-worker-1] INFO  (ConsistencyCheckTask.java [execute]:186) - Created backup to file: /backup/exist2303/inc20230621-1900.zip 
2023-06-21 19:20:20,571 [db.exist.scheduler.quartz-worker-1] INFO  (ConsistencyCheckTask.java [execute]:189) - Finished backup 
2023-06-22 01:00:00,002 [db.exist.scheduler.quartz-worker-4] INFO  (ConsistencyCheckTask.java [execute]:138) - Consistency check is paused. 

here's where we stand on length of time for backup to complete after a full. Reminder that eXist was restarted on 6-14

The first incremental after a full backup still takes 1 hour, on 6-15 19:00 and 6-20 19:00

Is that normal?

2023-06-11 20:17:55,050 [db.exist.scheduler.quartz-worker-1] INFO  (ConsistencyCheckTask.java [execute]:186) - Created backup to file: /backup/exist2303/inc20230611-1900.zip 
2023-06-11 20:17:55,050 [db.exist.scheduler.quartz-worker-1] INFO  (ConsistencyCheckTask.java [execute]:189) - Finished backup 
2023-06-12 19:00:00,001 [db.exist.scheduler.quartz-worker-4] INFO  (ConsistencyCheckTask.java [execute]:179) - Starting backup... 
2023-06-12 19:25:44,188 [db.exist.scheduler.quartz-worker-4] INFO  (ConsistencyCheckTask.java [execute]:186) - Created backup to file: /backup/exist2303/inc20230612-1900.zip 
2023-06-12 19:25:44,189 [db.exist.scheduler.quartz-worker-4] INFO  (ConsistencyCheckTask.java [execute]:189) - Finished backup 
2023-06-13 19:00:00,002 [db.exist.scheduler.quartz-worker-2] INFO  (ConsistencyCheckTask.java [execute]:179) - Starting backup... 
2023-06-13 19:28:55,479 [db.exist.scheduler.quartz-worker-2] INFO  (ConsistencyCheckTask.java [execute]:186) - Created backup to file: /backup/exist2303/inc20230613-1900.zip 
2023-06-13 19:28:55,480 [db.exist.scheduler.quartz-worker-2] INFO  (ConsistencyCheckTask.java [execute]:189) - Finished backup 
2023-06-14 19:00:00,001 [db.exist.scheduler.quartz-worker-1] INFO  (ConsistencyCheckTask.java [execute]:179) - Starting backup... 
2023-06-14 19:29:08,932 [db.exist.scheduler.quartz-worker-1] INFO  (ConsistencyCheckTask.java [execute]:186) - Created backup to file: /backup/exist2303/inc20230614-1900.zip 
2023-06-14 19:29:08,932 [db.exist.scheduler.quartz-worker-1] INFO  (ConsistencyCheckTask.java [execute]:189) - Finished backup 
2023-06-15 01:00:20,727 [db.exist.scheduler.quartz-worker-3] INFO  (ConsistencyCheckTask.java [execute]:179) - Starting backup... 
2023-06-15 01:01:56,281 [db.exist.scheduler.quartz-worker-3] INFO  (ConsistencyCheckTask.java [execute]:186) - Created backup to file: /backup/exist2303/full20230615-0100.zip 
2023-06-15 01:01:56,282 [db.exist.scheduler.quartz-worker-3] INFO  (ConsistencyCheckTask.java [execute]:189) - Finished backup 
2023-06-15 19:00:00,002 [db.exist.scheduler.quartz-worker-4] INFO  (ConsistencyCheckTask.java [execute]:179) - Starting backup... 
2023-06-15 20:01:51,046 [db.exist.scheduler.quartz-worker-4] INFO  (ConsistencyCheckTask.java [execute]:186) - Created backup to file: /backup/exist2303/inc20230615-1900.zip 
2023-06-15 20:01:51,046 [db.exist.scheduler.quartz-worker-4] INFO  (ConsistencyCheckTask.java [execute]:189) - Finished backup 
2023-06-16 19:00:00,002 [db.exist.scheduler.quartz-worker-1] INFO  (ConsistencyCheckTask.java [execute]:179) - Starting backup... 
2023-06-16 19:26:19,118 [db.exist.scheduler.quartz-worker-1] INFO  (ConsistencyCheckTask.java [execute]:186) - Created backup to file: /backup/exist2303/inc20230616-1900.zip 
2023-06-16 19:26:19,119 [db.exist.scheduler.quartz-worker-1] INFO  (ConsistencyCheckTask.java [execute]:189) - Finished backup 
2023-06-17 19:00:00,008 [db.exist.scheduler.quartz-worker-4] INFO  (ConsistencyCheckTask.java [execute]:179) - Starting backup... 
2023-06-17 19:25:25,934 [db.exist.scheduler.quartz-worker-4] INFO  (ConsistencyCheckTask.java [execute]:186) - Created backup to file: /backup/exist2303/inc20230617-1900.zip 
2023-06-17 19:25:25,935 [db.exist.scheduler.quartz-worker-4] INFO  (ConsistencyCheckTask.java [execute]:189) - Finished backup 
2023-06-18 19:00:00,002 [db.exist.scheduler.quartz-worker-1] INFO  (ConsistencyCheckTask.java [execute]:179) - Starting backup... 
2023-06-18 19:23:45,555 [db.exist.scheduler.quartz-worker-1] INFO  (ConsistencyCheckTask.java [execute]:186) - Created backup to file: /backup/exist2303/inc20230618-1900.zip 
2023-06-18 19:23:45,555 [db.exist.scheduler.quartz-worker-1] INFO  (ConsistencyCheckTask.java [execute]:189) - Finished backup 
2023-06-19 19:00:00,001 [db.exist.scheduler.quartz-worker-3] INFO  (ConsistencyCheckTask.java [execute]:179) - Starting backup... 
2023-06-19 19:01:17,326 [db.exist.scheduler.quartz-worker-3] INFO  (ConsistencyCheckTask.java [execute]:186) - Created backup to file: /backup/exist2303/full20230619-1900.zip 
2023-06-19 19:01:17,326 [db.exist.scheduler.quartz-worker-3] INFO  (ConsistencyCheckTask.java [execute]:189) - Finished backup 
2023-06-20 19:00:00,002 [db.exist.scheduler.quartz-worker-2] INFO  (ConsistencyCheckTask.java [execute]:179) - Starting backup... 
2023-06-20 19:59:59,220 [db.exist.scheduler.quartz-worker-2] INFO  (ConsistencyCheckTask.java [execute]:186) - Created backup to file: /backup/exist2303/inc20230620-1900.zip 
2023-06-20 19:59:59,220 [db.exist.scheduler.quartz-worker-2] INFO  (ConsistencyCheckTask.java [execute]:189) - Finished backup 

bkcsfi avatar Jun 22 '23 18:06 bkcsfi

I found this log, which confirms why the consistency check is paused.

How do I fix this problem? There are hundreds of documents named summary.xml in the database.

2023-06-15 01:00:00,026 [db.exist.scheduler.quartz-worker-3] INFO  (ConsistencyCheckTask.java [execute]:152) - Starting consistency check... 
2023-06-15 01:00:20,727 [db.exist.scheduler.quartz-worker-3] ERROR (ConsistencyCheckTask.java [execute]:165) - Errors found: 1 
2023-06-15 01:00:20,727 [db.exist.scheduler.quartz-worker-3] ERROR (ConsistencyCheckTask.java [execute]:170) - Fatal errors were found: pausing the consistency check task. 
2023-06-15 01:00:20,727 [db.exist.scheduler.quartz-worker-3] INFO  (ConsistencyCheckTask.java [execute]:175) - Finished consistency check 
2023-06-15 01:00:20,727 [db.exist.scheduler.quartz-worker-3] INFO  (ConsistencyCheckTask.java [execute]:179) - Starting backup... 
2023-06-15 01:01:54,146 [db.exist.scheduler.quartz-worker-3] ERROR (ConsistencyCheckTask.java [error]:250) - EXPORT: Skipping damaged document summary.xml 
2023-06-15 01:01:56,082 [db.exist.scheduler.quartz-worker-3] ERROR (ConsistencyCheckTask.java [error]:250) - EXPORT: Found an orphaned document: summary.xml 

bkcsfi avatar Jun 22 '23 18:06 bkcsfi

Phew isn’t there a folder inside the backup where these files are moved? Inspecting the contents outside exist might give you an idea.

Inside the exist instance I would search for files with that name stored as binary. You might not be able to open them, but if there are any. Those are likely the problem files.

None of these is a fool proof method though.

duncdrum avatar Jul 04 '23 20:07 duncdrum

Hi,

Are you sure that corrupted xml documents will have a 'type' of 'binary' when inspected from an xquery? e.g. will get-mime-type() return something other than text/xml for corrupted xml files?

bkcsfi avatar Jul 04 '23 21:07 bkcsfi

As we don’t know what’s wrong with these files no we can’t be sure.

I have seen corrupt .xml files reporting binary file type in exist. So if you have any those are culprits. Be warned though opening or processing such files inside exist, will lead to crashes.

duncdrum avatar Jul 04 '23 21:07 duncdrum

Is it possible the check consistency task to report the path of the 'bad' document in a future release?

bkcsfi avatar Jul 04 '23 21:07 bkcsfi

Please open a feature request. I m not aware of a way to do that atm. But it might exist. Having that as a separate issue will help us figure that out.

duncdrum avatar Jul 04 '23 21:07 duncdrum