[BUG] Log and Journal growing when restoring DB backups

Open • dariok opened this issue 1 year ago • 1 comment

Describe the bug

When restoring a large dump, the entire restore is handled as a single transaction. As a result, the journal keeps growing for as long as the restore runs and can become far too large.

See the following dir listing of ${eXist-Dir}/data:

total 134G
drwxr-xr-x  6 kampkaspar kampkaspar 4.0K Nov 13 09:38 .
drwxr-xr-x  8 kampkaspar kampkaspar 4.0K Nov 11 11:51 ..
-rw-r--r--  1 kampkaspar kampkaspar  72G Nov 13 14:48 0000000000.log
drwxr-xr-x  3 kampkaspar kampkaspar 160K Nov 13 09:43 blob
-rw-r--r--  1 kampkaspar kampkaspar  55K Nov 13 09:43 blob.dbx
-rw-r--r--  1 kampkaspar kampkaspar 2.3M Nov 13 14:47 collections.dbx
-rw-r--r--  1 kampkaspar kampkaspar   16 Nov 13 14:48 dbx_dir.lck
-rw-r--r--  1 kampkaspar kampkaspar  35G Nov 13 14:48 dom.dbx
drwxr-xr-x 12 kampkaspar kampkaspar 4.0K Nov 13 09:15 expathrepo
-rw-r--r--  1 kampkaspar kampkaspar   16 Nov 13 14:48 journal.lck
drwxr-xr-x  3 kampkaspar kampkaspar 4.0K Nov 13 14:45 lucene
-rw-r--r--  1 kampkaspar kampkaspar 8.0K Nov 13 09:14 ngram.dbx
drwxr-xr-x  3 kampkaspar kampkaspar 4.0K Nov 13 09:41 range
-rw-------  1 kampkaspar kampkaspar  453 Nov 13 09:38 restxq.registry
-rw-r--r--  1 kampkaspar kampkaspar 8.0K Nov 13 09:14 sort.dbx
-rw-r--r--  1 kampkaspar kampkaspar  22G Nov 13 14:48 structure.dbx
-rw-r--r--  1 kampkaspar kampkaspar  26K Nov 13 09:43 symbols.dbx
-rw-r--r--  1 kampkaspar kampkaspar 6.5G Nov 13 14:48 values.dbx

At some point this is likely to exhaust the available space in most environments, especially since the journal accounts for more than half of the total size of data/. Note that at the time of this listing the restore was perhaps half-way (maybe two thirds) done, and the dump is about 23G in size. As a rough estimate, a restore currently needs about 10 times the dump size for the journal alone.
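As a back-of-the-envelope check, the sizes from the listing can be extrapolated to the end of the restore. This is only a sketch: it assumes that growth is linear in restore progress (not something that was measured), and the function name `projected_final_size` is made up for illustration.

```python
def projected_final_size(current_gb: float, progress: float) -> float:
    """Extrapolate a size to 100% progress, assuming linear growth (an assumption)."""
    if not 0 < progress <= 1:
        raise ValueError("progress must be in (0, 1]")
    return current_gb / progress


# Figures taken from the directory listing and the description above.
JOURNAL_GB = 72.0    # 0000000000.log
DATA_DIR_GB = 134.0  # total size of data/
DUMP_GB = 23.0       # size of the dump being restored

# The restore was estimated to be between half and two thirds done.
for progress in (0.5, 2 / 3):
    journal = projected_final_size(JOURNAL_GB, progress)
    total = projected_final_size(DATA_DIR_GB, progress)
    print(
        f"progress {progress:.0%}: "
        f"journal ~{journal:.0f}G ({journal / DUMP_GB:.1f}x dump), "
        f"data/ ~{total:.0f}G ({total / DUMP_GB:.1f}x dump)"
    )
```

Under that linearity assumption, the journal alone projects to roughly 108G to 144G, and data/ as a whole to roughly 200G to 270G, so the order of magnitude of the estimate depends on which of the two is compared against the dump size.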

Expected behavior

The journal should not grow unboundedly.

To Reproduce

Restore a dump into a DB where less than ~ 10*${Dump-Size} of space is available.

Context (please always complete the following information)

  • Build: [eXist-6.x.x-develop]
  • Java: [1.8.0_432]
  • OS: [Linux 6.11.6-2-default (@OpenSuSE Tumbleweed)]

Additional context

  • How is eXist-db installed? [built from 6.x.x-develop]
  • Any custom changes in e.g. conf.xml? [none]

dariok, Nov 21 '24 11:11

One option would be to commit the transaction at suitable points during the restore.

The open question is where those points should be. Collections are the most likely candidate: once a collection has been fully restored, the transaction is committed, so that no single transaction spans more than that one collection.
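To illustrate the effect of the proposal, here is a toy model that compares the peak journal size of a single-transaction restore with a restore that commits after each collection. It is a sketch only and does not use eXist-db's API: `ToyJournal`, `restore_peak_journal`, and the document sizes are hypothetical, collections are treated as flat, and the model assumes that the journal can be truncated (checkpointed) once a transaction has been committed.

```python
from dataclasses import dataclass


@dataclass
class ToyJournal:
    """Hypothetical stand-in for a write-ahead journal; not eXist-db's implementation."""
    size: int = 0
    peak: int = 0

    def append(self, n_bytes: int) -> None:
        # Every write inside an open transaction adds to the journal.
        self.size += n_bytes
        self.peak = max(self.peak, self.size)

    def commit_and_checkpoint(self) -> None:
        # Assumption: after a commit, the journal can be truncated.
        self.size = 0


def restore_peak_journal(collections: dict[str, list[int]],
                         commit_per_collection: bool) -> int:
    """Return the peak journal size reached while 'restoring' the collections."""
    journal = ToyJournal()
    for doc_sizes in collections.values():
        for doc_size in doc_sizes:
            journal.append(doc_size)
        if commit_per_collection:
            journal.commit_and_checkpoint()
    if not commit_per_collection:
        # Single big transaction: commit only once, at the very end.
        journal.commit_and_checkpoint()
    return journal.peak


# Made-up document sizes per collection, in arbitrary units.
dump = {"/db/a": [5, 5, 5], "/db/b": [10, 10], "/db/c": [1]}

print(restore_peak_journal(dump, commit_per_collection=False))  # 36
print(restore_peak_journal(dump, commit_per_collection=True))   # 20
```

In this model the peak is bounded by the largest single collection rather than by the whole dump, so a dump dominated by one very large collection would still need a finer-grained commit point.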

dariok, Nov 21 '24 11:11