(Possible) FAQ: How to Verify Backups?

Open ChrisJefferson opened this issue 6 years ago • 13 comments

Possible Question: How can I verify my backups are complete/valid?

Possible Answer (if this is right?): "To check for filesystem issues on your backup drive, use 'btrfs scrub'. If 'btrbk stats' reports your backups as 'correlated', that means btrfs is reporting they are identical."

Background: I ran btrbk and got some weird error messages. I ran it again and everything was fine, but I'm worried one or more of my backups might be damaged/incomplete. I ran scrub on both my drive and backup, which came back clean. Can I check the backups are complete?

btrbk stats gives:

/mnt/btr_pool/@                              41 snapshots
^-- /media/caj/backup/btr_backups/@.*        75 backups   (41 correlated)
/mnt/btr_pool/@home                          41 snapshots (up-to-date)
^-- /media/caj/backup/btr_backups/@home.*    75 backups   (up-to-date, 41 correlated)

Does the 'correlated' mean "This is definitely the same backup", or just "something has the same name"?

ChrisJefferson avatar Oct 19 '18 10:10 ChrisJefferson

See this comment: https://github.com/digint/btrbk/issues/161#issuecomment-317712919

You are right, it would make sense to add this to a FAQ entry. I'm not really happy with my rsync approach to check all files (as it's not nicely scriptable), that's why it's not there yet.

Does the 'correlated' mean "This is definitely the same backup", or just "something has the same name"?

"correlated" basically means "the received_uuid of A matches uuid of B" (or vice-versa):

https://github.com/digint/btrbk/blob/v0.27.0/btrbk#L2278

This implies that both subvolumes have equal data. But then if something went wrong (e.g. bug in kernel or btrfs-progs), or if some other tool was messing around with the received_uuid flag or changed the data (e.g. btrfs property set), btrbk will still say it's "correlated".
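
To look at the two UUIDs by hand, a minimal sketch along the lines below may help. It assumes the "UUID:" and "Received UUID:" lines printed by btrfs subvolume show; the helper names subvol_field and is_correlated are made up for illustration, and this only mirrors the rule as described in this comment, not btrbk's actual code.

```shell
# Print one field ("UUID" or "Received UUID") from the output of
# `btrfs subvolume show`, read on stdin.
subvol_field() {
    awk -F':' -v key="$1" '
        {
            k = $1
            gsub(/^[ \t]+|[ \t]+$/, "", k)          # trim the field name
        }
        k == key && !done {
            v = $2
            gsub(/^[ \t]+|[ \t]+$/, "", v)          # trim the value
            print v
            done = 1
        }
    '
}

# Succeed if the received_uuid of one subvolume equals the uuid of the other.
# $1 and $2 are the full `btrfs subvolume show` outputs of the two subvolumes.
is_correlated() {
    a_uuid=$(printf '%s\n' "$1" | subvol_field "UUID")
    a_recv=$(printf '%s\n' "$1" | subvol_field "Received UUID")
    b_uuid=$(printf '%s\n' "$2" | subvol_field "UUID")
    b_recv=$(printf '%s\n' "$2" | subvol_field "Received UUID")

    if [ -n "$a_recv" ] && [ "$a_recv" = "$b_uuid" ]; then return 0; fi
    if [ -n "$b_recv" ] && [ "$b_recv" = "$a_uuid" ]; then return 0; fi
    return 1
}

# Usage (as root; SNAPSHOT and BACKUP are placeholders for your own paths):
#   is_correlated "$(btrfs subvolume show "$SNAPSHOT")" \
#                 "$(btrfs subvolume show "$BACKUP")" && echo correlated
```

A match here only says the UUID bookkeeping lines up; as noted above, it does not prove the data is identical.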

digint avatar Oct 23 '18 12:10 digint

You could use something like md5sum, although that would probably be slow compared to scrub. Compare the output on both volumes.

Perhaps a better way is to run md5sum only on changed files?

ghost avatar Oct 24 '18 09:10 ghost

Well md5sum only works on regular files, so any empty directory, symlink, etc. will not be checked. rsync does this in a much more complete way.
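
To get a feel for how much a plain md5sum pass would leave unchecked, something like the following sketch counts the entries in a tree that are not regular files. The function name count_unhashed is made up for illustration, and -mindepth assumes GNU or BSD find.

```shell
# Count entries below directory $1 that are not regular files
# (directories, symlinks, devices, ...), i.e. entries whose existence
# and metadata a per-file content hash would never compare.
count_unhashed() {
    find "$1" -mindepth 1 ! -type f | wc -l
}

# Usage (SNAPSHOT is a placeholder for a snapshot path):
#   count_unhashed "$SNAPSHOT"
```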

digint avatar Oct 24 '18 14:10 digint

You are of course correct. I was merely referring to the comment about rsync being more difficult to script.

Using "btrbk diff" might be easy enough to script with some simple tools.

In any case, normally it is the file data that matters most, not the other metadata.

ghost avatar Oct 24 '18 17:10 ghost

I added a neat little script to automate backup checks in the btrbk-check branch:

https://github.com/digint/btrbk/blob/btrbk-check/contrib/tools/btrbk-check.sh

This will use rsync -i -n -c -a --delete --numeric-ids -H -A -X for each snapshot/backup pair listed by btrbk list latest.
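
For a single pair, the same rsync call can be tried by hand. The sketch below is not the script itself, just the flags quoted above wrapped in two helper functions (verify_pair and count_diffs are made-up names). With -n nothing is transferred or deleted, and with -i rsync prints one line per item that differs.

```shell
# Dry-run comparison of two directory trees, using the flags quoted above:
#   -i  itemize differences       -n  dry run, change nothing
#   -c  compare file checksums    -a  archive mode (perms, times, owner, ...)
#   --delete       also report files that exist only on the backup side
#   --numeric-ids  compare uid/gid by number, not by name
#   -H  hard links    -A  ACLs    -X  extended attributes
verify_pair() {
    rsync -i -n -c -a --delete --numeric-ids -H -A -X "$1/" "$2/"
}

# Count the non-empty lines on stdin, i.e. the number of reported differences.
count_diffs() {
    awk 'NF { n++ } END { print n + 0 }'
}

# Usage (as root; SNAPSHOT and BACKUP are placeholders for your own paths):
#   verify_pair "$SNAPSHOT" "$BACKUP" | count_diffs
```

A count of 0 means rsync found nothing to change in either direction for that pair.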

Works for me, try:

cd /tmp/
wget https://raw.githubusercontent.com/digint/btrbk/btrbk-check/contrib/tools/btrbk-check.sh
chmod +x btrbk-check.sh
# dryrun, see what would be checked
./btrbk-check.sh -p -v -n
# real run, for a specific backup:
./btrbk-check.sh -p -v /mnt/btr_backup/data

Using "btrbk diff" might be easy enough to script with some simple tools.

I don't think it would be that easy: it would involve keeping a database of per-file digests and updating it for every changed file. It would probably also make checks less reliable, as btrbk diff uses btrfs subvolume find-new for listing the diffs, which might produce the same errors as send/receive.

digint avatar Oct 28 '18 19:10 digint

I streamlined and merged the script to master: contrib/cron/btrbk-verify:

  • 9ed41c8937bee3d7b1e9bce32a1aa82a129e4bd6 btrbk-verify: tool for automated backup integrity check based on rsync
  • 805d7f4a0d33216de2b4e9fc4ad83c01d322b40c btrbk-verify: add workaround for btrbk <= 0.27.2 bug: missing target_rsh, target_type

Please read the comments in btrbk-verify for more details, and try:

cd /tmp/
wget https://raw.githubusercontent.com/digint/btrbk/master/contrib/cron/btrbk-verify
chmod +x btrbk-verify
./btrbk-verify -h

# check (dryrun) on "/mnt/btr_pool/mysvol"
./btrbk-verify latest /mnt/btr_pool/mysvol -v -v -n

# real run, for a specific backup:
./btrbk-verify latest /mnt/btr_pool/mysvol  -v

Note that if you have ssh targets, you need to have btrbk-0.28.0-dev from master:

wget https://raw.githubusercontent.com/digint/btrbk/master/btrbk
chmod +x btrbk
export PATH=/tmp/:$PATH
./btrbk-verify [...]

digint avatar Apr 01 '19 16:04 digint

How does rsync compare the files? Does it have to load the complete file or can it calculate a checksum on the remote machine and compare that? In other words: Does checking require all the bandwidth of a full backup?

raumi75 avatar Apr 09 '19 14:04 raumi75

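rsync only transfers the checksum (MD5, 128-bit) per file, so no, it does not require the bandwidth of a full backup. Each side still has to read its files in full to compute the checksums, so expect heavy disk I/O on both ends. See rsync(1), --checksum for more details.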

digint avatar Apr 09 '19 14:04 digint

I can't check it over ssh because on the server side, the ssh_filter_btrbk.sh is blocking the command

ERROR: ssh_filter_btrbk.sh: ssh command rejected: disallowed command...

I'm running the git version 20c390893a22d453da83d398e2eed8b585bd8668 (April 5, 19:29)

Is there an updated version?

raumi75 avatar Apr 11 '19 19:04 raumi75

This is a known "problem": the fact is that rsync really, really needs root. That's why btrbk-verify has the --ssh-user and --ssh-identity options. ssh_filter_btrbk.sh is not suitable for this, as allowing rsync amounts to allowing root access.

Make sure you have an SSH key which allows root access on the target, and run:

btrbk-verify --ssh-agent --ssh-user root --ssh-identity /etc/btrbk/ssh/id_ed25519 [...]

digint avatar Apr 11 '19 23:04 digint

This works great for most of my subvolumes. But there is one which keeps reporting errors.

Here is one line of my btrbk-verify output:

[rsync] .f.......a. iw.git/objects/49/fc2ceb5745bf1d6dc1fe3ef6ed38a0eebd6c05   # FAIL ndiffs=7461

Curiously, the md5sum is identical on both sides of the backup

$ md5sum /mnt/btrfs_ssd/_btrbk_snap/@git.20190922T0050/iw.git/objects/49/fc2ceb5745bf1d6dc1fe3ef6ed38a0eebd6c05 /mnt/backuphd/_btrbk/@git.20190922T0050/iw.git/objects/49/fc2ceb5745bf1d6dc1fe3ef6ed38a0eebd6c05
acf51e26bf94287a477cbd4ecd6b0a77  /mnt/btrfs_ssd/_btrbk_snap/@git.20190922T0050/iw.git/objects/49/fc2ceb5745bf1d6dc1fe3ef6ed38a0eebd6c05
acf51e26bf94287a477cbd4ecd6b0a77  /mnt/backuphd/_btrbk/@git.20190922T0050/iw.git/objects/49/fc2ceb5745bf1d6dc1fe3ef6ed38a0eebd6c05

The only difference I can see seems to be the file permissions on the source and backup side

$ getfacl /mnt/btrfs_ssd/_btrbk_snap/@git.20190922T0050/iw.git/objects/49/fc2ceb5745bf1d6dc1fe3ef6ed38a0eebd6c05
getfacl: Removing leading '/' from absolute path names
# file: mnt/btrfs_ssd/_btrbk_snap/@git.20190922T0050/iw.git/objects/49/fc2ceb5745bf1d6dc1fe3ef6ed38a0eebd6c05
# owner: jan
# group: admin
user::r--
group::r--
other::---

$ getfacl /mnt/backuphd/_btrbk/@git.20190922T0050/iw.git/objects/49/fc2ceb5745bf1d6dc1fe3ef6ed38a0eebd6c05
getfacl: Removing leading '/' from absolute path names
# file: mnt/backuphd/_btrbk/@git.20190922T0050/iw.git/objects/49/fc2ceb5745bf1d6dc1fe3ef6ed38a0eebd6c05
# owner: jan
# group: admin
user::r--
group::rwx			#effective:r--
group:admin:rwx			#effective:r--
mask::r--
other::---

I wonder why the file permissions (including acl in my case) could have gotten out of sync? Is this a btrfs-related bug or did I configure something wrong?

raumi75 avatar Sep 22 '19 08:09 raumi75

btrbk-verify runs rsync with --acls option, you can disable this by running btrbk-verify --ignore-acls.

A reason for the ACL mismatch could be that you don't mount btrfs with the "acl" option on one side. Or that ACL support is not configured in the kernel?
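
For reading the itemized lines yourself: rsync(1) describes the 11-character string as YXcstpoguax, where the first two characters are the update type and file type and each later slot stands for one attribute. The small decoder below is only a sketch (explain_itemize is a made-up name), and it treats every slot that is not "." or a space as "differs". The meaning of the "u" slot depends on the rsync version, so it is reported simply as u-slot.

```shell
# Print which attribute slots of an rsync --itemize-changes string are set.
# $1 is the 11-character string, e.g. ".f.......a."
explain_itemize() {
    flags=$1
    out=""
    i=3    # cut positions are 1-based; 1 and 2 are update type and file type
    for name in checksum size mtime permissions owner group u-slot acl xattr; do
        ch=$(printf '%s' "$flags" | cut -c"$i")
        case "$ch" in
            "." | " " | "") ;;                      # slot unchanged
            *) out="${out}${out:+ }${name}" ;;      # slot reports a difference
        esac
        i=$((i + 1))
    done
    printf '%s\n' "$out"
}

# Usage:
#   explain_itemize '.f.......a.'
```

For the line quoted above this prints "acl", which fits the getfacl output: the file content matches and only the ACL differs.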

Sorry for the late reply, hope this helps.

digint avatar Oct 27 '19 15:10 digint

Hello @digint, I just found this thread when I was looking for a way to verify btrbk backups. I tried your btrbk-verify and found that for some binaries in /usr/bin, rsync reported differences in xattrs although there did not even seem to be any xattrs set for these files.

I have posted this question (including more details) to unix.stackexchange.com at https://unix.stackexchange.com/questions/607032/rsync-itemize-changes-lists-some-xattr-differences-but-getfattr-shows-no-diff

Do you maybe have an idea what could cause this?

Btw, thank you for your excellent development on btrbk!

Edit: I have created an Issue for btrfs-progs for this topic: https://github.com/kdave/btrfs-progs/issues/292

Nox996 avatar Aug 30 '20 12:08 Nox996