docker-borgmatic

assistance to debug borg segfault

Open ramcq opened this issue 3 years ago • 4 comments

I've got borgmatic set up as an additional container in a Mailcow containerised setup, running on Docker 20.10.5 on an Intel Xeon system with Debian bullseye, along these lines: https://mailcow.github.io/mailcow-dockerized-docs/third_party-borgmatic/

I believe I am seeing https://github.com/borgbackup/borg/issues/5899:

borgmatic-mailcow_1  | crond: USER root pid  64 cmd PATH=$PATH:/usr/bin /usr/bin/borgmatic --stats -v 0 2>&1
borgmatic-mailcow_1  | Fatal Python error: Segmentation fault
borgmatic-mailcow_1  | Current thread 0x00007fa3e65ceb48 (most recent call first):
borgmatic-mailcow_1  |   File "/usr/lib/python3.9/borg/cache.py", line 740 in write_archive_index
borgmatic-mailcow_1  |   File "/usr/lib/python3.9/borg/cache.py", line 736 in fetch_and_build_idx
borgmatic-mailcow_1  |   File "/usr/lib/python3.9/borg/cache.py", line 824 in create_master_idx
borgmatic-mailcow_1  | [email protected]:repo: Error running actions for repository
borgmatic-mailcow_1  | Command 'borg prune --keep-hourly 24 --keep-daily 7 --keep-weekly 4 --keep-monthly 6 --prefix {hostname}- --stats [email protected]:repo' died with <Signals.SIGSEGV: 11>.
borgmatic-mailcow_1  | /etc/borgmatic.d/config.yaml: Error running configuration file
borgmatic-mailcow_1  | 
borgmatic-mailcow_1  | summary:
borgmatic-mailcow_1  | /etc/borgmatic.d/config.yaml: Error running configuration file
borgmatic-mailcow_1  | [email protected]:repo: Error running actions for repository
borgmatic-mailcow_1  | Fatal Python error: Segmentation fault
borgmatic-mailcow_1  | Current thread 0x00007fa3e65ceb48 (most recent call first):
borgmatic-mailcow_1  |   File "/usr/lib/python3.9/borg/cache.py", line 740 in write_archive_index
borgmatic-mailcow_1  |   File "/usr/lib/python3.9/borg/cache.py", line 736 in fetch_and_build_idx
borgmatic-mailcow_1  |   File "/usr/lib/python3.9/borg/cache.py", line 824 in create_master_idx
borgmatic-mailcow_1  | Command 'borg prune --keep-hourly 24 --keep-daily 7 --keep-weekly 4 --keep-monthly 6 --prefix {hostname}- --stats [email protected]:repo' died with <Signals.SIGSEGV: 11>.
borgmatic-mailcow_1  | 
borgmatic-mailcow_1  | Need some help? https://torsion.org/borgmatic/#issues

Not to repeat the borg bug here, but do you have any suggestions for how to instrument the container and Python with debug tools and symbols? I can use gdb, strace, valgrind, etc., but I have not used them within a container environment. I'm a relative Docker novice and a complete newcomer to Alpine, but since I am able to reproduce this relatively rare, intermittent issue, it might help to solve the bug if I can get some useful debug information out.

ramcq avatar Jan 18 '22 21:01 ramcq

You should be able to exec into the borgmatic container and install additional debug tools using apk.

Using docker compose (the second command is run inside the container):

$ docker-compose exec borgmatic sh
$ apk add gdb

I'm unfamiliar with these debug tools, so I'm unsure whether installing them in the container is sufficient to investigate the segfault.
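
As a rough sketch of how such a session might continue, the failing `borg prune` command from the log could be re-run under gdb in batch mode so that a native backtrace is printed when the segfault hits. Several things here are assumptions and not from this thread: that the container is allowed to ptrace (for example via `cap_add: SYS_PTRACE` in the compose file), that borg is installed as a Python script at `/usr/bin/borg`, and the helper name `borg_under_gdb`, which is hypothetical. Without debug symbols for Python and borg's C extensions, the backtrace will likely be of limited use.

```shell
#!/bin/sh
# Hypothetical helper: print a gdb command line that re-runs a borg
# command and dumps a backtrace of all threads if it crashes.
# Assumes borg is a Python script at /usr/bin/borg (check with `which borg`).
# Arguments are wrapped in single quotes, so they must not contain one.
borg_under_gdb() {
    printf '%s' "gdb -batch -ex run -ex 'thread apply all bt' --args python3 /usr/bin/borg"
    for arg in "$@"; do
        printf " '%s'" "$arg"
    done
    printf '\n'
}

# Example: a prune command like the one in the log, with a placeholder repository.
borg_under_gdb prune --keep-daily 7 --prefix '{hostname}-' --stats REPO
```

The printed command is meant to be pasted into the shell opened by `docker-compose exec borgmatic sh`, after `apk add gdb`. Printing the command first, instead of running it directly, makes it easy to check the quoting before anything touches the repository.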

grantbevis avatar Jan 19 '22 06:01 grantbevis

Any updates @ramcq?

grantbevis avatar Feb 12 '22 14:02 grantbevis

Same here, using the latest releases of borgmatic and borg on Alpine Linux (latest):

borg[3242478]: Fatal Python error: Segmentation fault
borg[3242478]: Current thread 0x00007f8efcddbb48 (most recent call first):
borg[3242478]:   File "/usr/lib/python3.10/site-packages/borg/cache.py", line 769 in write_archive_index
borg[3242478]:   File "/usr/lib/python3.10/site-packages/borg/cache.py", line 765 in fetch_and_build_idx
borg[3242478]:   File "/usr/lib/python3.10/site-packages/borg/cache.py", line 853 in create_master_idx
borg[3242478]:   File "/usr/lib/python3.10/site-packages/borg/cache.py", line 899 in sync
borg[3242478]:   File "/usr/lib/python3.10/site-packages/borg/cache.py", line 493 in __init__
borg[3242478]:   File "/usr/lib/python3.10/site-packages/borg/cache.py", line 374 in local
borg[3242478]:   File "/usr/lib/python3.10/site-packages/borg/cache.py", line 383 in __new__
borg[3242478]:   File "/usr/lib/python3.10/site-packages/borg/archiver.py", line 1522 in do_prune
borg[3242478]: REDACTED_REPO: Error running actions for repository
borg[3242478]: Command 'borg prune --keep-daily 7 --keep-hourly 0 --keep-monthly 12 --keep-weekly 4 --keep-yearly 2 --prefix matrix- --stats REDACTED_REPO' died with <Signals.SIGSEGV: 11>.
borg[3242478]: Error while creating a backup.
borg[3242478]: /etc/borgmatic.d/config.yaml: Error running configuration file
borg[3242478]: summary:
borg[3242478]: /etc/borgmatic.d/config.yaml: Error running configuration file
borg[3242478]: REDACTED_REPO: Error running actions for repository
borg[3242478]: Remote: Warning: Permanently added 'REDACTED_REPO_HOST' (ED25519) to the list of known hosts.
borg[3242478]: Fatal Python error: Segmentation fault
borg[3242478]: Current thread 0x00007f8efcddbb48 (most recent call first):
borg[3242478]:   File "/usr/lib/python3.10/site-packages/borg/cache.py", line 769 in write_archive_index
borg[3242478]:   File "/usr/lib/python3.10/site-packages/borg/cache.py", line 765 in fetch_and_build_idx
borg[3242478]:   File "/usr/lib/python3.10/site-packages/borg/cache.py", line 853 in create_master_idx
borg[3242478]:   File "/usr/lib/python3.10/site-packages/borg/cache.py", line 899 in sync
borg[3242478]:   File "/usr/lib/python3.10/site-packages/borg/cache.py", line 493 in __init__
borg[3242478]:   File "/usr/lib/python3.10/site-packages/borg/cache.py", line 374 in local
borg[3242478]:   File "/usr/lib/python3.10/site-packages/borg/cache.py", line 383 in __new__
borg[3242478]:   File "/usr/lib/python3.10/site-packages/borg/archiver.py", line 1522 in do_prune
borg[3242478]: Command 'borg prune --keep-daily 7 --keep-hourly 0 --keep-monthly 12 --keep-weekly 4 --keep-yearly 2 --prefix matrix- --stats REDACTED_REPO' died with <Signals.SIGSEGV: 11>.

I suppose this is not a borg or borgmatic issue per se, but rather a difference between the build and runtime environments. A similar issue in a completely unrelated project has the following comment:

Signals.SIGSEGV: 11 seems to be a common bug when you are using different libraries/environments for compiling and running the code. I find a similar issue in other projects such as nvvl, this might give you a hint to check the running/compiling libraries to fix the bug.

aine-etke avatar Jun 28 '22 08:06 aine-etke

Looking at the upstream bug, this might be fixed in borg 1.2.2.
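
A minimal sketch for checking whether the borg inside the container is at or past that release. The helper name `version_ge` is hypothetical, and it assumes a plain numeric MAJOR.MINOR.PATCH version, so release candidates such as `1.2.2rc1` are not handled.

```shell
#!/bin/sh
# Hypothetical helper: succeed if dotted version $1 is >= version $2.
# Compares the first three numeric fields; a missing field counts as 0.
version_ge() {
    for i in 1 2 3; do
        a=$(printf '%s' "$1" | cut -d. -f"$i")
        b=$(printf '%s' "$2" | cut -d. -f"$i")
        a=${a:-0}
        b=${b:-0}
        [ "$a" -gt "$b" ] && return 0
        [ "$a" -lt "$b" ] && return 1
    done
    return 0
}

# In the container, the installed version could be read with:
#   installed=$(borg --version | cut -d' ' -f2)   # "borg 1.2.0" -> "1.2.0"
installed=1.2.0   # placeholder value for illustration
if version_ge "$installed" 1.2.2; then
    echo "borg $installed is at or past 1.2.2"
else
    echo "borg $installed is older than 1.2.2"
fi
```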

horihel avatar Aug 08 '22 05:08 horihel