borg Add --ignore-err16 N and --ignore-err13 N arguments

The goal here: there are some folders where you will always get Error 16 (Device or resource busy) or Error 13 (Permission denied). Since those errors are expected and you don't want to play whack-a-mole with excludes, it might be useful to let borg keep those two errors from causing a non-zero return value.

--ignore-err16 10

That would tell borg that, unless it sees more than 10 [Errno 16] occurrences, they should be treated as warnings and not result in a non-zero exit code. Once it hits the 11th occurrence, it starts treating them as real errors and lets them affect the exit code.

The same approach could be taken for error 13 (permission denied, with --ignore-err13 N), or error 2 (locked files, with --ignore-err2 N).

This means that if you just back up /cygwin/c/Users/, any locked files, permission issues, or device/resource busy errors could be ignored, unless you get a large number of them, which would indicate that a more serious issue than expected has occurred.

I generally don't care if a few files don't get backed up. And I don't want to be pestered daily because the backup program tells me about them every single day. But I do care if hundreds of files did not get backed up, because that probably indicates that something more serious went wrong.

Jul 31 '15 11:07 tgharold

Hmm, guess I am not really convinced this is helpful.

Basically one is saying "10 fails are ok, but 11 are not". How about just running the backup once, taking all the paths reported as not readable and adding them to the exclude list? This would make very sure that only the stuff you knowingly excluded is skipped and every other read failure is reported as a warning (== telling you your backup is incomplete).

In the long run, I think we need a VSS snapshotting wrapper for borg on Windows to avoid this kind of trouble and also to get better consistency.

Jul 31 '15 13:07 ThomasWaldmann

That's exactly what I'm saying. To step back a bit, there are a few ways to deal with error conditions that you know will happen. (Usually because you don't have 100% control over the environment, i.e. you can't force users to log out before the backups run.)

A) Treat all errors as errors, i.e. do nothing. This results in a mailbox full of "failed" backup reports making the admin's life harder unless there is some way to filter by severity.

B) Play whack-a-mole with the backup definition. This means running the backup again and again and again until you figure out all of the possible directories that might cause errors and exclude them from the script. But maybe the errors only happen 50% of the time, because of "open files", so you have to create a separate backup to try and back those files up on an opportunistic basis.

This also requires a much more complex backup specification.

C) Ignore all errors of a certain type. Similar to turning off compiler errors (or changing them to warnings), it's a blunt approach, but works. Now you don't get any errors, just lots of warnings. That would be equivalent to a "--ignore-err-##" without the ability to specify a limit.

The downside here is that you have to monitor the backup to see if the count of error type X suddenly spiked.

D) Treat errors of a particular type as warnings until some threshold is passed, after which that particular error gets treated as an error. This is a more nuanced approach than (C), as it lets you set a pain threshold before you want to be notified about the problem. Maybe you know that there will always be approximately 5 errors of type X, so you set the threshold to 10.

Aug 02 '15 00:08 tgharold

Implementation might not be too difficult, assuming that you have a dictionary/list which lets you look up values by a key.

Initialize two lists (or a list with multiple values for each key). One holds the count of errors that have occurred so far. The other holds the threshold up to which those errors are still treated as warnings. The threshold for each error number would default to zero. The number of entries in the list would be equal to the number of error conditions that you are willing to support.

When an error occurs, increment the counter associated with that error number, then check it against the error threshold. If (count <= threshold), spit it out as a warning. Otherwise, spit it out as an error (and set the exit code). With a threshold of 10, the 11th occurrence is the first one treated as an error.
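
The counting scheme described above could be sketched roughly as follows. This is only an illustration, not borg code: the class name `ErrnoThresholds` and its methods are made up for this example, and it follows the `--ignore-err16 10` description in which the 11th occurrence is the first one treated as an error.

```python
import errno
from collections import Counter


class ErrnoThresholds:
    """Hypothetical helper: count OS errors per errno, decide warning vs. error."""

    def __init__(self, thresholds=None):
        # Maps errno -> number of occurrences tolerated as warnings.
        # Missing entries default to 0, so the first occurrence is already an error.
        self.thresholds = dict(thresholds or {})
        self.counts = Counter()

    def record(self, err):
        """Count one occurrence of errno `err` and return 'warning' or 'error'."""
        self.counts[err] += 1
        if self.counts[err] <= self.thresholds.get(err, 0):
            return "warning"
        return "error"

    def exit_code(self):
        """Return 1 if any errno exceeded its threshold, otherwise 0."""
        exceeded = any(
            count > self.thresholds.get(err, 0)
            for err, count in self.counts.items()
        )
        return 1 if exceeded else 0


tracker = ErrnoThresholds({errno.EBUSY: 10})
levels = [tracker.record(errno.EBUSY) for _ in range(11)]
```

The `counts` attribute is also what a per-errno summary in the statistics output could be built from.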

The tracking of error counts could also be used in the --stats output to indicate: how many errors of each different type occurred (for counts > 0)

Aug 02 '15 00:08 tgharold

KISS: --ignore-errnos 16,13,1234 (or maybe with names errno.errorcode has them)
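
For illustration, parsing such a comma-separated list could look like the sketch below. `--ignore-errnos` is only a proposal in this thread, and `parse_errnos` is a hypothetical helper name; the only real API used is Python's standard `errno` module, whose attributes supply the symbolic names.

```python
import errno


def parse_errnos(spec):
    """Parse a string like '16,13,EIO' into a set of errno numbers."""
    result = set()
    for token in spec.split(","):
        token = token.strip()
        if not token:
            continue  # tolerate stray commas
        if token.isdigit():
            result.add(int(token))
            continue
        # Symbolic names such as 'EBUSY' are integer attributes of the errno module.
        value = getattr(errno, token, None)
        if not isinstance(value, int):
            raise ValueError(f"unknown errno name: {token!r}")
        result.add(value)
    return result


ignored = parse_errnos("16,13,1234")
```

Numbers are accepted without validation here, so an unknown value like 1234 simply never matches anything.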

Jul 16 '16 18:07 enkore

Some threshold would be useful for me, too -- this would be some way to separate "every-day errors = small number of files skipped" and "massive problem = nothing backed up".

I agree that having this kind of threshold is an ugly workaround, but I'm looking for a similar solution. Error 16 is thrown for files in use (mounted NTFS volume). For me this occurs at random because those files are opened by some application or service for an undefined (maybe short) period of time.

(BTW: is there some wait+retry in borg like /R:n and /W:n in robocopy? I didn't find anything in the docs.)

From the user's point of view this might not be an error if the file was in a backup within the last N days. I don't think this is something that borg should do internally. It would be useful, but it is a very special case.

Maybe support this usage e.g. by separating these errors from more serious ones. @enkore's suggestion --ignore-errnos would help here. Skipped files should be logged anyway. Maybe add some special "skipped-files.lst" output? You could run wc -l on it to evaluate a threshold, or diff it with the previous run's output to identify recurring errors.
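
As a rough sketch of that evaluation step: assuming borg wrote one skipped path per line (it does not do this today; the "skipped-files.lst" output is only a suggestion), a wrapper could compare two runs like this. The function names are hypothetical.

```python
def read_skipped(lines):
    """Turn an iterable of text lines into a set of non-empty paths."""
    return {line.strip() for line in lines if line.strip()}


def compare_runs(previous_lines, current_lines, threshold):
    """Summarize the current run's skipped files against the previous run."""
    previous = read_skipped(previous_lines)
    current = read_skipped(current_lines)
    return {
        "count": len(current),                       # what `wc -l` would report
        "over_threshold": len(current) > threshold,  # "massive problem" indicator
        "recurring": sorted(current & previous),     # skipped in both runs
        "new": sorted(current - previous),           # skipped only this time
    }


report = compare_runs(
    ["C:/Users/a/settings.dat\n", "C:/Users/a/mail.pst\n"],
    ["C:/Users/a/settings.dat\n", "C:/Users/b/ntuser.dat\n"],
    threshold=10,
)
```

Passing open file objects instead of lists works the same way, since both are iterables of lines.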

Aug 08 '17 09:08 StefanBertels

@StefanBertels these "sometimes inaccessible files" are Windows-specific (sometimes for known, sometimes for unknown reasons); that usually never happens on Linux. We do not have wait/retry for such cases (yet?), but that could be the topic of a separate issue.

We currently only have 1 log stream (usually stderr, but configurable), but we could use a separate logger, so it could be configured differently. Also a separate ticket for discussion.

Aug 08 '17 15:08 ThomasWaldmann

I have a different idea for an implementation, though I may be pushing my luck here in terms of complexity.

Allow the user to submit a list of files, similar to the --exclude and --exclude-from options, for which errors may be ignored, but which should be backed up if it's possible to do so without errors.

My main use case here is the various settings.dat files in the Users directory that Windows 10 uses for its internal apps. Once opened they are inexplicably kept open until the machine is restarted, but I believe that after a reboot they can be read and copied. And presumably they're worth copying, since they contain the settings I want to restore.

What I'm not sure about with this alternative is whether it should be possible to specify that only some error codes may be ignored.

Sep 04 '17 20:09 skyegecko

Outlook .pst files are a fine example: you don't want to exclude them, but sometimes they have to be skipped. Another option might be --skip-on-16 or something?

Jan 04 '18 15:01 tuxick

I like dutchgecko's idea. It would be very useful.

Jan 04 '18 16:01 henfri

@tuxick borg's default behaviour when there is an error with a file is to emit a warning and continue with the next file. If there are warnings, rc changes from 0 to 1.

Jan 04 '18 19:01 ThomasWaldmann

Oh right, I'd better rephrase then: have an option to make it return 0 (or maybe better, some particular value) when encountering/skipping a specified error. I ran into this problem because my script only uploads stats when borg returns 0.

Jan 04 '18 19:01 tuxick

While working around this, I noticed that create --stats gives no output when rc != 0. So I ended up calling borg info when rc = 1, and checking the rc of that instead.

Jan 05 '18 12:01 tuxick

@tuxick please file a separate bug for that.

Jan 05 '18 15:01 ThomasWaldmann

FWIW, I can live with the output (I can enable quiet mode, or grep them away), but the non-zero exit code is annoying when calling borg from a script.

Jun 24 '20 16:06 reitzig

borg 1.4 will have more detailed exit codes (RCs) for misc. conditions, see PR #7976.

Specifically for this issue, there is a mapping in the BackupIO wrapper class (which is used to deal with OSErrors on source files):

    E_MAP = {
        errno.EPERM: BackupPermissionError,
        errno.EACCES: BackupPermissionError,
        errno.ENOENT: BackupFileNotFoundError,
        errno.EIO: BackupIOError,
    }

From frontends.rst docs:

    FileChangedWarning rc: 100
        {}: file changed while we backed it up

    BackupError rc: 102
        {}: backup error

    BackupRaceConditionError rc: 103
        {}: file type or inode changed while we backed it up (race condition, skipped file)

    BackupOSError rc: 104
        {}: {}

    BackupPermissionError rc: 105
        {}: {}

    BackupIOError rc: 106
        {}: {}

    BackupFileNotFoundError rc: 107
        {}: {}

The given error codes for these warnings are used only if all warnings were of the same code (and there were no errors, which would override the warning rc).

Jan 01 '24 18:01 ThomasWaldmann

There is no code for EBUSY yet, but I guess it can be added in a similar way.
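
A self-contained sketch of what "added in a similar way" might mean is shown below. The exception classes are stand-ins defined only for this example (the real ones live in borg, and their hierarchy is assumed here), `BackupBusyError` is a hypothetical name that does not exist in the thread, and the fallback to `BackupOSError` for unmapped errnos is likewise an assumption.

```python
import errno


# Stand-ins for the classes named in the thread; not borg's actual definitions.
class BackupOSError(Exception):
    pass


class BackupPermissionError(BackupOSError):
    pass


class BackupFileNotFoundError(BackupOSError):
    pass


class BackupIOError(BackupOSError):
    pass


class BackupBusyError(BackupOSError):
    """Hypothetical class for EBUSY (Errno 16, device or resource busy)."""


E_MAP = {
    errno.EPERM: BackupPermissionError,
    errno.EACCES: BackupPermissionError,
    errno.ENOENT: BackupFileNotFoundError,
    errno.EIO: BackupIOError,
    errno.EBUSY: BackupBusyError,  # the new, hypothetical entry
}


def classify(os_error):
    """Pick the exception class for an OSError; assumed fallback is BackupOSError."""
    return E_MAP.get(os_error.errno, BackupOSError)


busy_class = classify(OSError(errno.EBUSY, "Device or resource busy"))
```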

Jan 01 '24 18:01 ThomasWaldmann

In the borg 1.4-maint and master branches, this will use the new warning system and exit borg with a very specific return code if this was the only kind of warning. For compatibility reasons, this needs BORG_EXIT_CODES=modern, at least in 1.4.

I think this is a better solution than adding an option to completely ignore issues: the return code can be checked in the wrapper, which can react accordingly.

I am aware that the current implementation is likely not complete, nor can it deal with more complex situations (like multiple different kinds of warnings occurring). PRs with improvements are welcome.
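
A wrapper along those lines might look like the following sketch. The return codes 105 and 107 and the BORG_EXIT_CODES=modern setting are taken from this thread; the function names, the chosen set of tolerated codes, and the way borg is invoked are assumptions made for illustration.

```python
import os
import subprocess

# Warning return codes as quoted from frontends.rst in this thread.
RC_BACKUP_PERMISSION_ERROR = 105
RC_BACKUP_FILE_NOT_FOUND_ERROR = 107

# Example policy (an assumption): permission warnings are expected, nothing else is.
DEFAULT_TOLERATED = frozenset({RC_BACKUP_PERMISSION_ERROR})


def is_acceptable(rc, tolerated=DEFAULT_TOLERATED):
    """Treat rc 0 and any explicitly tolerated warning rc as success."""
    return rc == 0 or rc in tolerated


def run_borg(args, tolerated=DEFAULT_TOLERATED):
    """Run borg with modern exit codes and apply the policy to its return code."""
    env = dict(os.environ, BORG_EXIT_CODES="modern")
    proc = subprocess.run(["borg", *args], env=env)
    return is_acceptable(proc.returncode, tolerated)
```

Because the specific code is only returned when all warnings were of that one kind, a mixed run would fall back to a generic warning rc and be rejected by this policy.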

Feb 24 '24 19:02 ThomasWaldmann
