
Parallel backups from different hosts crash attic

Open gullevek opened this issue 9 years ago • 7 comments

Attic: Debian package 0.13-1. The target is one shared archive for all hosts. Backups run over the local LAN via ssh, except for one host, which backs up locally and directly.

I want to run backups from several hosts via ssh to one backup host. Running those backups at the same time will crash attic:

Traceback (most recent call last):
  File "/usr/bin/attic", line 3, in <module>
    main()
  File "/usr/lib/python3/dist-packages/attic/archiver.py", line 715, in main
    exit_code = archiver.run(sys.argv[1:])
  File "/usr/lib/python3/dist-packages/attic/archiver.py", line 705, in run
    return args.func(args)
  File "/usr/lib/python3/dist-packages/attic/archiver.py", line 129, in do_create
    archive.save()
  File "/usr/lib/python3/dist-packages/attic/archive.py", line 196, in save
    self.manifest.write()
  File "/usr/lib/python3/dist-packages/attic/helpers.py", line 121, in write
    self.repository.put(self.MANIFEST_ID, self.key.encrypt(data))
  File "/usr/lib/python3/dist-packages/attic/remote.py", line 234, in put
    return self.call('put', id_, data, wait=wait)
  File "/usr/lib/python3/dist-packages/attic/remote.py", line 127, in call
    for resp in self.call_many(cmd, [args], **kw):
  File "/usr/lib/python3/dist-packages/attic/remote.py", line 158, in call_many
    raise self.RPCError(error)
attic.remote.RPCError: b'AttributeError'

OR

Traceback (most recent call last):
  File "/usr/bin/attic", line 3, in <module>
    main()
  File "/usr/lib/python3/dist-packages/attic/archiver.py", line 715, in main
    exit_code = archiver.run(sys.argv[1:])
  File "/usr/lib/python3/dist-packages/attic/archiver.py", line 705, in run
    return args.func(args)
  File "/usr/lib/python3/dist-packages/attic/archiver.py", line 128, in do_create
    self._process(archive, cache, args.excludes, args.exclude_caches, skip_inodes, path, restrict_dev)
  File "/usr/lib/python3/dist-packages/attic/archiver.py", line 177, in _process
    os.path.join(path, filename), restrict_dev)
  File "/usr/lib/python3/dist-packages/attic/archiver.py", line 177, in _process
    os.path.join(path, filename), restrict_dev)
  File "/usr/lib/python3/dist-packages/attic/archiver.py", line 177, in _process
    os.path.join(path, filename), restrict_dev)
  File "/usr/lib/python3/dist-packages/attic/archiver.py", line 163, in _process
    archive.process_file(path, st, cache)
  File "/usr/lib/python3/dist-packages/attic/archive.py", line 407, in process_file
    self.add_item(item)
  File "/usr/lib/python3/dist-packages/attic/archive.py", line 170, in add_item
    self.write_checkpoint()
  File "/usr/lib/python3/dist-packages/attic/archive.py", line 174, in write_checkpoint
    self.save(self.checkpoint_name)
  File "/usr/lib/python3/dist-packages/attic/archive.py", line 196, in save
    self.manifest.write()
  File "/usr/lib/python3/dist-packages/attic/helpers.py", line 121, in write
    self.repository.put(self.MANIFEST_ID, self.key.encrypt(data))
  File "/usr/lib/python3/dist-packages/attic/remote.py", line 234, in put
    return self.call('put', id_, data, wait=wait)
  File "/usr/lib/python3/dist-packages/attic/remote.py", line 127, in call
    for resp in self.call_many(cmd, [args], **kw):
  File "/usr/lib/python3/dist-packages/attic/remote.py", line 158, in call_many
    raise self.RPCError(error)
attic.remote.RPCError: b'AttributeError'

For now I have to run them one after another. This should be fixed.

gullevek avatar Nov 21 '14 01:11 gullevek

@gullevek can you still reproduce this with a current attic version?

A problem is that the traceback does not contain enough information about the AttributeError that happened on the remote side.

ThomasWaldmann avatar Oct 16 '15 14:10 ThomasWaldmann

A problem is that the traceback does not contain enough information about the AttributeError that happened on the remote side.

Slightly off-topic, but can you tell me how to get a traceback from the remote side (in general)? Does the attic serve call log its messages to some particular place by default?

mgrachten avatar Nov 07 '15 20:11 mgrachten

@mgrachten AFAIK this is not the case.

ThomasWaldmann avatar Nov 07 '15 22:11 ThomasWaldmann

I just installed attic to back up all my systems (around 10 computers). All of them run Debian jessie and thus the Debian attic package, version 0.13-1. The backups are simply done over ssh to the same host and the same attic archive. All backups start at the same time, via a cron job on every host that runs a command like this:

attic create --do-not-cross-mountpoints [email protected]:/backups/attic::goofy.blabla.lan_root /

But I get a lot of errors like the ones shown in the comments above.

Are there known issues with running attic in parallel? Do I have to add some locking? Any ideas?

Actually, I am a little disappointed that such a basic issue is still open. Attic looked so professional and promising to me, and now it fails completely on the very first night. The resulting archive is corrupted. At the very least, it should be documented that manual, self-made locking is mandatory.

I just checked the system by running the backup scripts manually, one after another, and all is fine. It only fails when they run in parallel.

Next week I will probably add manual locking to the backup scripts.
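For illustration only, manual locking around a backup command could look roughly like the sketch below. It assumes that every backup job is started on a machine where the same lock file is visible (for example, one coordinating host that launches each backup); that setup is an assumption and not something described in this thread. The lock path and the function name `run_serialized` are hypothetical, and `fcntl.flock` is a standard-library call, not part of attic.

```python
import fcntl
import subprocess

# Hypothetical lock file location; any path shared by all backup jobs would do.
LOCK_PATH = "/tmp/attic-backup.lock"


def run_serialized(cmd, lock_path=LOCK_PATH):
    """Run cmd while holding an exclusive lock, so jobs run one at a time."""
    with open(lock_path, "w") as lock_file:
        # Blocks here until no other process holds the lock.
        fcntl.flock(lock_file, fcntl.LOCK_EX)
        try:
            # Returns the exit code of the wrapped command.
            return subprocess.call(cmd)
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)
```

A job would then call `run_serialized([...])` with its usual attic command line as a list of arguments. Jobs that do not share the lock file are not serialized by this.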

erikhydro avatar Jun 24 '16 08:06 erikhydro

@erikhydro Have a look at borgbackup (a fork of attic); a lot of issues are fixed there. IIRC it is available from the backports repo. There is also a single-file binary.

But "parallel" backups to the same repository will be serialized (as they should be with attic, too) and will also require re-synchronization of the chunks index, so whether one repo or separate repos are the better idea depends on the situation.

ThomasWaldmann avatar Jun 24 '16 10:06 ThomasWaldmann

https://attic-backup.org/quickstart.html#remote-repositories Here I can read that I can use a remote repository by mounting it and then using a local attic, but that I must then make sure that no two instances run in parallel. That I have read and understood. But I concluded that when using ssh rather than a locally mounted repository, I could run backups in parallel.

Can you confirm that attic does NOT serialize and thus will corrupt the archive, so that it is not somehow my error?

If so, I will file another bug in the Debian bug tracker. There is already a grave one: https://bugs.debian.org/cgi-bin/pkgreport.cgi?dist=jessie;package=attic I have the strong feeling that administrators should really know that they can NOT use attic over ssh to back up their server farm! (Unless they take extra care not to run backups in parallel, but that is dangerously easy to forget. There really should be a report.)

erikhydro avatar Jun 24 '16 15:06 erikhydro

If you don't have working POSIX locks (because you are using a filesystem that does not support them correctly), you are out of luck with attic. I assumed you have client/server attic after seeing the command line you posted.

That is the main reason why borg uses mkdir-based locking. That basically works everywhere, even on rather crappy filesystems. It has one disadvantage though: the lock does not automatically go away when the process gets killed rather violently (kill -9, power failure, ...).
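As a rough illustration of the idea (not borg's actual code), a mkdir-based lock relies on directory creation being atomic: whoever creates the directory holds the lock, and everyone else gets an error. The class name `DirLock` and the directory name `backup.lock` are made up for this sketch.

```python
import os
import tempfile


class DirLock:
    """Illustrative lock built on the atomicity of mkdir."""

    def __init__(self, path):
        self.path = path

    def acquire(self):
        try:
            os.mkdir(self.path)  # fails if the directory already exists
            return True
        except FileExistsError:
            return False

    def release(self):
        os.rmdir(self.path)


# Hypothetical lock location inside a fresh temporary directory.
lock_dir = os.path.join(tempfile.mkdtemp(), "backup.lock")
first = DirLock(lock_dir)
second = DirLock(lock_dir)

got_first = first.acquire()    # the first caller creates the directory
got_second = second.acquire()  # refused while the directory exists
first.release()
got_after_release = second.acquire()  # works once the directory is gone
```

The disadvantage mentioned above shows up directly here: if the holder dies before `release()` runs, the directory stays behind and every later `acquire()` keeps failing until someone removes it.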

ThomasWaldmann avatar Jun 24 '16 15:06 ThomasWaldmann