Aborting (Ctrl + C) borg check leaves a lock

Open sophie-h opened this issue 3 years ago • 14 comments

Have you checked borgbackup docs, FAQ, and open Github issues?

Yes

Is this a BUG / ISSUE report or a QUESTION?

BUG / ISSUE report

System information. For client/server mode post info for both machines.

Not so easy, since borg serve has no support for this?

Your borg version (borg -V).

$ borg --version
borg 1.1.17

Operating system (distribution) and version.

$ uname -a
Linux desktop 5.14.0-4-amd64 #1 SMP Debian 5.14.16-1 (2021-11-03) x86_64 GNU/Linux

Hardware / network configuration, and filesystems used.

How much data is handled by borg?

Full borg command line that led to the problem (leave out excludes and passwords)

borg check

Describe the problem you're observing.

Run borg check and press Ctrl + C. The next operation on the repo fails with:

Failed to create/acquire the lock /lock.exclusive (timeout).
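
The reproduction steps above could be scripted roughly as follows. This is only a sketch: the helper names `interrupt_after` and `reproduce` are made up for illustration, `borg list` is just one possible follow-up command, and sending SIGINT to the borg process alone only approximates pressing Ctrl + C in a terminal (which signals the whole foreground process group).

```python
import signal
import subprocess
import time


def interrupt_after(cmd, delay):
    """Start cmd, send it SIGINT after `delay` seconds, return its exit code."""
    proc = subprocess.Popen(cmd)
    time.sleep(delay)
    if proc.poll() is None:
        # Still running: simulate Ctrl + C for this one process.
        proc.send_signal(signal.SIGINT)
    return proc.wait()


def reproduce(repo, delay=5.0):
    """Interrupt `borg check`, then try a second command on the same repo.

    Hypothetical helper: returns both exit codes so the caller can see
    whether the follow-up command failed (for example on the lock).
    """
    first = interrupt_after(["borg", "check", repo], delay)
    second = subprocess.run(["borg", "list", repo]).returncode
    return first, second
```

Calling `reproduce("backup-server:~/backup-laptop")` would then show whether the second command is blocked after the interrupted check.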

Can you reproduce the problem? If so, describe how. If not, describe troubleshooting steps you took before opening the issue.

See above.

Include any warning/errors/backtraces from the system logs

sophie-h, Nov 17 '21 01:11

Local or remote or both kinds of repos?

ThomasWaldmann, Nov 17 '21 13:11

Also it would be good to have a traceback from Ctrl-C when it leaves behind the lock.

ThomasWaldmann, Nov 17 '21 13:11

Sounds like I have two different issues with check (see also #6038) that only occur with my remote backup and are not reproducible for others? I will try to find a different remote to test there as well.

$ /usr/bin/borg check --debug --bypass-lock backup-server:~/backup-laptop
using builtin fallback logging configuration
35 self tests completed in 0.06 seconds
SSH command line: ['ssh', 'backup-server', 'borg', 'serve', '--umask=077', '--debug']
^CRemoteRepository: 172 B bytes sent, 66 B bytes received, 3 messages sent
Connection closed by remote host
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/borg/archiver.py", line 177, in wrapper
    return method(self, args, repository=repository, **kwargs)
  File "/usr/lib/python3/dist-packages/borg/archiver.py", line 344, in do_check
    if not repository.check(repair=args.repair, save_space=args.save_space):
  File "/usr/lib/python3/dist-packages/borg/remote.py", line 477, in do_rpc
    return self.call(f.__name__, named, **extra)
  File "/usr/lib/python3/dist-packages/borg/remote.py", line 712, in call
    for resp in self.call_many(cmd, [args], **kw):
  File "/usr/lib/python3/dist-packages/borg/remote.py", line 807, in call_many
    r, w, x = select.select(self.r_fds, w_fds, self.x_fds, 1)
  File "/usr/lib/python3/dist-packages/borg/helpers.py", line 2274, in handler
    raise exc_cls
KeyboardInterrupt

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/borg/archiver.py", line 4703, in main
    exit_code = archiver.run(args)
  File "/usr/lib/python3/dist-packages/borg/archiver.py", line 4635, in run
    return set_ec(func(args))
  File "/usr/lib/python3/dist-packages/borg/archiver.py", line 177, in wrapper
    return method(self, args, repository=repository, **kwargs)
  File "/usr/lib/python3/dist-packages/borg/remote.py", line 634, in __exit__
    self.rollback()
  File "/usr/lib/python3/dist-packages/borg/remote.py", line 477, in do_rpc
    return self.call(f.__name__, named, **extra)
  File "/usr/lib/python3/dist-packages/borg/remote.py", line 712, in call
    for resp in self.call_many(cmd, [args], **kw):
  File "/usr/lib/python3/dist-packages/borg/remote.py", line 814, in call_many
    raise ConnectionClosed()
borg.remote.ConnectionClosed: Connection closed by remote host

Platform: Linux desktop 5.14.0-4-amd64 #1 SMP Debian 5.14.16-1 (2021-11-03) x86_64
Linux: Unknown Linux  
Borg: 1.1.17  Python: CPython 3.9.8 msgpack: 0.5.6.+borg1
PID: 59868  CWD: /home/herold
sys.argv: ['/usr/bin/borg', 'check', '--debug', '--bypass-lock', 'backup-server:~/backup-laptop']
SSH_ORIGINAL_COMMAND: None
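
The two chained tracebacks above follow a common Python pattern: an exception interrupts the body of a `with` block, and the cleanup in `__exit__` then fails as well because it needs a connection that is already gone. The toy classes below are stand-ins written for illustration, not borg's actual code; they only mimic the shape seen in the traceback (`__exit__` calling `rollback()`).

```python
class ConnectionClosed(Exception):
    """Stand-in for borg.remote.ConnectionClosed (illustrative only)."""


class FakeRemoteRepository:
    """Toy context manager whose cleanup needs a live connection."""

    def __init__(self):
        self.connected = True

    def __enter__(self):
        return self

    def rollback(self):
        # Cleanup talks to the remote side; fails if the link is gone.
        if not self.connected:
            raise ConnectionClosed("Connection closed by remote host")

    def __exit__(self, exc_type, exc, tb):
        if exc_type is not None:
            # Raises while the KeyboardInterrupt is still being handled.
            self.rollback()
        return False


def run_check():
    with FakeRemoteRepository() as repo:
        repo.connected = False   # the SSH connection drops ...
        raise KeyboardInterrupt  # ... as Ctrl + C arrives
```

Calling `run_check()` raises `ConnectionClosed`, with the original `KeyboardInterrupt` attached as its `__context__`, which is what produces the "During handling of the above exception, another exception occurred" message.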

sophie-h, Nov 17 '21 17:11

Both issues seem to be exclusive to my Hetzner Storage Box. Unfortunately, ssh backup-server borg serve --show-version --debug does not give me any output, which is already strange, I guess? So I'm not sure whether I can find out which version they are running.

What confuses me is that a lot of borg features work without any problems here. Any ideas?

sophie-h, Nov 17 '21 17:11

I'd suspect Hetzner runs borg serve as a forced command from .ssh/authorized_keys. This might influence which commands and options you are able to invoke from your side.

ThomasWaldmann, Nov 17 '21 18:11

And you should never use --bypass-lock unless you are very, very sure about what you are doing (see also the docs).

ThomasWaldmann, Nov 17 '21 18:11

I'd suspect Hetzner runs borg serve as a forced command from .ssh/authorized_keys. This might influence which commands and options you are able to invoke from your side.

Sure. But how does this explain that borg check works fine, except for missing some of the messages and behaving oddly on Ctrl + C? I will ask Hetzner if they know what's going on with their setup.

sophie-h, Nov 17 '21 18:11

Was there any useful feedback from Hetzner?

ThomasWaldmann, Jan 22 '22 16:01

Paraphrasing Hetzner support's comment on this:

We can reproduce this problem. However, this currently looks like a borg issue. On the server-side, the 'borg serve' process is still running after the SSH session is closed. After a few seconds (or more for a larger repo) the process terminates and the lock is removed.
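
If the lock really is released once the server-side process exits, the follow-up command can be asked to wait for it instead of giving up quickly. The sketch below uses borg's `--lock-wait SECONDS` common option; the helper names and the 120-second default are arbitrary choices for illustration, not recommendations.

```python
import subprocess


def borg_cmd(repo, subcommand="list", lock_wait=120):
    """Build a borg command line that waits up to `lock_wait` seconds for the lock."""
    return ["borg", "--lock-wait", str(lock_wait), subcommand, repo]


def run_after_interrupt(repo, lock_wait=120):
    """Run the follow-up command and return borg's exit code (hypothetical helper)."""
    return subprocess.run(borg_cmd(repo, lock_wait=lock_wait)).returncode
```

If the command still times out with a generous `lock_wait`, that would point to a lock that is left behind permanently rather than released late.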

Sounds like this could be expected behavior?

sophie-h, Feb 09 '22 20:02

OK, so we are not talking about a permanently left-behind lock, but just that the process takes some time until it terminates.

I'd say this is expected and not a bug.

If someone wants to invest some time into this, it could be analysed more precisely how long it takes and why exactly it takes that time.
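
One rough way to take such a measurement, assuming filesystem access to the repository directory and assuming the exclusive lock appears there as an entry named `lock.exclusive` (as the error message suggests), is to poll until that path disappears. The function name and defaults below are made up for this sketch.

```python
import os
import time


def seconds_until_gone(path, timeout=60.0, interval=0.1):
    """Poll until `path` no longer exists.

    Returns the elapsed seconds, or None if `timeout` is reached first.
    """
    start = time.monotonic()
    while time.monotonic() - start < timeout:
        if not os.path.exists(path):
            return time.monotonic() - start
        time.sleep(interval)
    return None
```

Started right after interrupting the client, `seconds_until_gone(os.path.join(repo_dir, "lock.exclusive"))` would give an approximate release delay, with a resolution limited by `interval`; `repo_dir` here is a placeholder for the server-side repository path.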

ThomasWaldmann, Feb 10 '22 02:02

I reproduced this as a permanent lock but only with Hetzner so far. Not sure what those guys are doing with check.

sophie-h, Jun 29 '22 17:06

Maybe related: #6912

ThomasWaldmann, Aug 06 '22 07:08

It would be interesting to know whether this can still be reproduced after the fix for #6912.

ThomasWaldmann, Jun 08 '23 21:06

BTW, while working on #7893 I did not experience left-over locks when using Ctrl-C. Even when I used kill -9 to kill the client, the server side did not leave a lock behind.

ThomasWaldmann, Oct 27 '23 21:10