[Contribution] qubes-incremental-backup-poc OR Wyng backup
Community Devs: @v6ak, @tasket
@v6ak's PoC: https://github.com/v6ak/qubes-incremental-backup-poc
@tasket's PoC: https://github.com/tasket/wyng-backup
Status update as of 2022-08-16: https://github.com/QubesOS/qubes-issues/issues/858#issuecomment-1217463303
Reported by joanna on 14 May 2014 10:38 UTC
Migrated-From: https://wiki.qubes-os.org/ticket/858
Note to any contributors who wish to work on this issue: Please either ask for details or propose a design before starting serious work on this.
Modified by joanna on 14 May 2014 10:38 UTC
Comment by joanna on 14 May 2014 10:42 UTC Discussion here:
https://groups.google.com/d/msg/qubes-devel/Gcrb7KQVcMk/CK-saQU_1HYJ
@Rudd-O in https://github.com/QubesOS/qubes-issues/issues/1588#issuecomment-225452668
Honestly, the backup tool should be replaced by something like Duplicity. They get it right, they do incremental backups, they do encryption, and they do arbitrary targets, so it would be extremely easy to accommodate backing up to another VM, or even to damn S3 if we wanted to.
Ideally, the way I would see that working is:
- Take snapshot of file system containing VMs. Mount the snapshot somewhere, read-only.
- Execute duplicity pointing it to the read-only VM data source, and the target storage destination (VM or local mountpoint)
- Destroy snapshot.
This would also allow for backup of VMs that are running, so no need to shut down VMs during backup. I highly recommend we research replacing qvm-backup with Duplicity.
It looks like Duplicity supports making an incremental backup even when only part of a file has changed (it includes a diff of the file, not the full changed file). So indeed it may be a good idea to somehow use it.
But mounting a VM filesystem in dom0 is a big NO-GO. On the other hand, it may be a good idea to simply run duplicity in the VM and collect its output. Take a look at the linked discussion for an idea of how to handle powered-off VMs (in short: launch a minimal "stubdomain"-like system with access to the VM's disk for the backup purpose).
Of course all this requires evaluating whether duplicity correctly handles encryption/validation, so as not to make things worse than they currently are...
On 06/12/2016 07:04 PM, Marek Marczykowski-Górecki wrote:
But mounting a VM filesystem in dom0 is a big NO-GO. On the other hand, it may be a good idea to simply run duplicity in the VM and collect its output. Take a look at the linked discussion for an idea of how to handle powered-off VMs (in short: launch a minimal "stubdomain"-like system with access to the VM's disk for the backup purpose).
No need to mount any VM filesystem in dom0. That is also not how Duplicity works either.
The way it would work with Duplicity is that a backend would need to be written. This backend would have two main functions, really:
- let Duplicity retrieve the encrypted rdiff database from the backup VM, so that Duplicity can locally (in dom0) compute the differences and store only the differences.
- let Duplicity push the encrypted full / differential backup files, as well as the updated encrypted rdiff database, to the backup VM.
See this:
https://bazaar.launchpad.net/~duplicity-team/duplicity/0.7-series/files/head:/duplicity/backends/
Later today, or perhaps tomorrow, I will use my Qubes bombshell-client to simulate an SSH connection to the backup VM, and see how Duplicity behaves using this false SSH. That can serve as a good starting point.
Rudd-O
http://rudd-o.com/
Better still:
https://bazaar.launchpad.net/~duplicity-team/duplicity/0.7-series/view/head:/duplicity/backends/ssh_pexpect_backend.py
Remarkably much like my Qubes Ansible plugin.
Rudd-O
http://rudd-o.com/
Can you explain more how exactly that would work?
- Where and at what level would VM data be retrieved (the private.img file, or individual files from within?)
- Where and at what level is incremental data computed?
Generally dom0 shouldn't be exposed to VM filesystem in any way (regardless of the form: mounting directly, accessing via sftp-like service etc). If incremental data needs to be computed at VM-file level, it should be done in separate VM and dom0 should treat the result as opaque blob. Also during restore.
On 06/12/2016 08:15 PM, Marek Marczykowski-Górecki wrote:
Can you explain more how exactly that would work?
- Where and at what level VM data would be retrieved (the private.img file, individual files from within?)
Duplicity + a hypothetical plugin (from now on, HP) would never retrieve any VM data. It would only retrieve an encrypted file from the backup directory within the VM. This encrypted file contains a manifest of what exists as a backup, as well as a set of rdifflib diff information.
- Where and at what level incremental data is computed?
Incremental data is computed in dom0 using both the rdifflib database and the actual contents of the snapshotted VM images. This should be safe because the database is stored encrypted, so the VM cannot tamper with it and induce code execution in dom0.
Generally dom0 shouldn't be exposed to VM filesystem in any way (regardless of the form: mounting directly, accessing via sftp-like service etc). If incremental data needs to be computed at VM-file level, it should be done in separate VM and dom0 should treat the result as opaque blob. Also during restore.
That cannot work because that involves copying the IMG files into the separate VM, and that would take forever.
Duplicity's backends are designed to make the storage opaque to Duplicity. The backend retrieves files and uploads files, and those files are encrypted by the time the data reaches the backend plugin. The backend does not care about anything else.
Rudd-O
http://rudd-o.com/
That cannot work because that involves copying the IMG files into the separate VM, and that would take forever.
Not necessary - it can be attached with a qvm-block-like mechanism.
Anyway, IIUC you want to back up *.img files in dom0 using duplicity. This may work. My initial (maybe invalid) concern about input validation on restore still stands.
Well, look at what I have got here:
(deleted cos buggy, see below for updated version)
I have a (shit) complete Qubes VM backend for duplicity which you can add to your dom0's duplicity backends directory, and then run something like this:
duplicity /var/lib/qubes qubesvm://backupvm/Backupsmountpoint/mydom0/qubesvms
Very early working prototype and building block. Enjoy!
I have some experience with Duplicity:
- It works, but I remember some problems when backup was interrupted. This might have been fixed, though. Alternatively, it is possible to add some sanity checks.
- It uses GPG for encryption. I am not sure about the defaults, but it is configurable. Both symmetric and asymmetric modes are supported. For some good reasons, I prefer the asymmetric mode for backups.
- I believe it is authenticated somehow, but I have checked the options some time ago, so I am not 100% sure.
- Provided that it is properly authenticated and the authentication data are properly checked, I feel @Rudd-O's approach is basically correct. (Maybe some extra validation/sanitization might be needed for filenames.) GPG in dom0 seems to already be part of the TCB, so I hope that it is included in the “critical security updates” provided for dom0 even after EOL of the Fedora version used in dom0.
- Data are stored in large blocks of roughly (or maybe exactly) the same (configurable) size, except the last one. So, this does not reveal the structure of files.
- Some metadata are AFAIR stored separately, which theoretically allows guessing average file size. However, this can be considered as a tradeoff between maximum privacy and minimum bandwidth usage. This tradeoff is reasonable for me.
- Incremental backups reveal the amount of data changed between backups, of course. Again, this is some tradeoff.
- It uses compression and I am not sure if this is configurable. This introduces some potential side channel. (See CRIME/BREACH attacks for more details.) However, with some reasonable usage, the side channel is likely to be too noisy to be practically useful. Note that the current Qubes backup system seems to have a similar side channel, except that incremental backups make the side channel inherently less noisy.
- Per-VM backup is a double-edged sword. On one hand, it eliminates some inter-VM attacks. On the other hand, it makes data-size based side channels less noisy. Maybe we could get advantages of both (perform per-VM compression first, then divide to blocks and encrypt), but this seems to be far-from-trivial to properly design and implement.
I would prefer using Duplicity per-VM, so one could exclude caches etc. from the backup. My idea is something like the following process invoked from dom0:
- Start a DVM (say BackupDVM) and optionally disable networking there.
- Attach the corresponding private.img to the BackupDVM.
- Send a per-VM key (stored in dom0 and backed-up separately) to the BackupDVM.
- Mount the private.img (well, rather /dev/xvdi) in the BackupDVM.
- Run Duplicity (or maybe another similar app) in the BackupDVM. The backup storage would be some custom that sends commands over QubesRPC to BackupStorageVM. The BackupStorageVM would be just a proxy from QubesRPC to a real backup storage for multiple domains.
As you can see, this would require some data stored in dom0 to be also backed up, but this could be handled by some very similar mechanism. (It would probably run directly in dom0 rather than in a DVM.) A rough sketch of this per-VM flow follows below.
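A very rough dom0-side sketch of that per-VM flow, just to make the steps concrete. The qvm-* invocations are pseudo-commands (their exact flags differ between Qubes releases), and the DVM name, key location and the in-DVM "vm-side-backup" helper are hypothetical:
#!/usr/bin/env python
# Rough outline only: exact qvm-* syntax varies between Qubes releases,
# and the in-DVM "vm-side-backup" script is hypothetical.
import subprocess

def backup_one_vm(vmname, backup_dvm="backup-dvm"):
    # 1. Start the DisposableVM used only for this one VM (networking off).
    subprocess.check_call(["qvm-start", backup_dvm])
    # 2. Attach the source VM's private image to the BackupDVM.
    subprocess.check_call(["qvm-block", "attach", backup_dvm,
                           "dom0:/var/lib/qubes/appvms/%s/private.img" % vmname])
    # 3. Feed the per-VM key on stdin and run the backup inside the DVM;
    #    the Duplicity backend there talks to the BackupStorageVM over qrexec.
    with open("/var/lib/qubes-backup-keys/%s.key" % vmname, "rb") as key:
        subprocess.check_call(
            ["qvm-run", "--pass-io", backup_dvm, "vm-side-backup " + vmname],
            stdin=key)
    # 4. Discard the DVM together with its view of the data.
    subprocess.check_call(["qvm-shutdown", "--wait", backup_dvm])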
Some implementation and security notes on those points:
- We need the VM's name for multiple purposes. This part seems to be harder than I thought, because there seems to be no proper way of starting a DVM and getting its name other than cloning and modifying /usr/lib/qubes/qfile-daemon-dvm.
- For standard VMs, this requires them to be off. For LVM-backed VMs, it is possible to make a CoW clone and back up the clone while the VM is running. There is one assumption for the CoW approach: I assume there is some consistency kept on unclean shutdown (e.g. you don't use data=writeback). You see, the CoW clone made while the VM is running would look exactly like a drive recovering from an unclean shutdown.
- I am currently not sure if this can be a raw key or if it has to be a password. But this is rather a minor concern.
- Sure, this exposes the BackupDVM to the VM's filesystem. However, the BackupDVM is used only for the one VM and then discarded. In case of some kernel vulnerability, a malicious VM could alter or remove its own old backups; it could not touch other VMs' backups. Moreover, it would have access to the DVM, which might be some concern when multiple DVMs are implemented, especially for some untrusted Whonix-workstation-based VMs. In those cases, just another VM should be used here. (Maybe we could use something like qvm-trim-template, which seems to use a corresponding template instead of the standard DVM template.)
- The BackupDVM would contain something similar to the implementation by @Rudd-O, except rewritten to use RPC, because the BackupStorageVM shouldn't trust the corresponding BackupDVMs. The BackupStorageVM would prepend some VM-specific path to all the files (maybe some random identifier from a table stored in dom0, in order to prevent leaking VM names to the backup storages), validate the filenames (maybe \A[a-z][a-z0-9.-]*\Z would be enough for Duplicity) and send them to some actual Duplicity backup storage backend (e.g. rsync, SFTP, Amazon S3, Google Drive, …). A rough sketch of such a proxy follows below.
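For illustration, a minimal sketch of what the BackupStorageVM side of such an RPC proxy could look like. The service name (qubes.BackupStore), the one-line "put <filename>" protocol and the storage path are assumptions, not an existing interface; the filename regex is the one suggested above:
#!/usr/bin/env python
# Hypothetical /etc/qubes-rpc/qubes.BackupStore handler on the BackupStorageVM.
# It accepts a single "put <filename>" request from a BackupDVM, validates the
# filename, prefixes a per-caller directory and stores stdin there.
import os
import re
import shutil
import sys

STORE = "/var/backups/qubes"                       # assumed storage root
FILENAME_RE = re.compile(r"\A[a-z][a-z0-9.-]*\Z")  # regex suggested above

caller = os.environ.get("QREXEC_REMOTE_DOMAIN", "")  # set by qrexec
request = sys.stdin.readline().strip()
op, _, filename = request.partition(" ")

if op != "put" or not FILENAME_RE.match(filename):
    sys.exit("invalid request")

target_dir = os.path.join(STORE, caller)  # per-VM prefix; could be a random id
if not caller or not os.path.isdir(target_dir):
    sys.exit("unknown caller")

# The rest of stdin is the (already encrypted) file content.
with open(os.path.join(target_dir, filename), "wb") as out:
    shutil.copyfileobj(sys.stdin, out)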
- Encryption. The ciphers used by GnuPG suck by default. You have to tell Duplicity about that. Use --gpg-options '--cipher-algo=AES256 --digest-algo=SHA256' to fix that.
- Compression is configurable. You can choose the compressor. Command line switch, in fact. I believe it's a GPG command line switch, but the man page of duplicity does talk about it.
- There's no need to be attaching and detaching disks to a backup VM or similar monkey business. With this setup, Duplicity can back up your image files to a backup VM painlessly and incrementally.
- The philosophy of Duplicity is simplicity -- point it to a source dir and a dest dir (for qubesvm:// see below) and Duplicity will back up the source dir to the dest dir. Presumably we would add value by creating a system of "profiles" (each defined as "what to back up, and where to") where the user can use a GUI or a config file to establish a routine backup plan (perhaps automatic) of one or more of these profiles.
- Adapter for Duplicity in dom0. Here is the latest version of the plugin that enables Duplicity to back up to a VM. Deposit it at /usr/lib64/python2.7/site-packages/duplicity/backends/qubesvmbackend.py. Then the destination URL in your duplicity command line can be qubesvm://yourbackupvm/path/to/directory/to/store/backup. I backed up /var/lib/qubes (technically, a mounted snapshot of it) with no problem to a qubesvm://mybackupvm/mnt/disk/qubesbackup directory.
# -*- Mode:Python; indent-tabs-mode:nil; tab-width:4 -*-
#
# This file is NOT part of duplicity. It is an extension to Duplicity
# that allows the Qubes OS dom0 to back up to a Qubes OS VM.
#
# Duplicity is free software, and so is this file. This file is
# under the same license as the one distributed by Duplicity.
import os
import pipes
import subprocess
import duplicity.backend
from duplicity.errors import *
BLOCKSIZE = 1048576 # for doing file transfers by chunk
MAX_LIST_SIZE = 10485760 # limited to 10 MB directory listings to avoid problems
class QubesVMBackend(duplicity.backend.Backend):
"""This backend accesses files stored on a Qubes OS VM. It is intended to
work as a backend within a Qubes OS dom0 (TCB) for the purposes of using
Duplicity to back up to a VM. No special tools are required other than
this backend file itself installed to your Duplicity backends directory.
Missing directories on the remote (VM) side will NOT be created. It is
an error to try and back up to a VM when the target directory does
not already exist.
module URL: qubesvm://vmname/path/to/backup/directory
"""
def __init__(self, parsed_url):
duplicity.backend._ensure_urlparser_initialized()
duplicity.backend.urlparser.uses_netloc.append("qubesvm")
duplicity.backend.urlparser.clear_cache()
properly_parsed_url = duplicity.backend.ParsedUrl(parsed_url.geturl())
duplicity.backend.Backend.__init__(self, properly_parsed_url)
if properly_parsed_url.path:
self.remote_dir = properly_parsed_url.path
else:
self.remote_dir = '.'
self.hostname = properly_parsed_url.hostname
def _validate_remote_filename(self, op, remote_filename):
if os.path.sep in remote_filename or "\0" in remote_filename:
raise BackendException(
("Qubes VM %s failed: path separators "
"or nulls in destination file name %s") % (
op, remote_filename))
def _dd(self, iff=None, off=None):
cmd = ["dd", "status=none", "bs=%s" % BLOCKSIZE]
if iff:
cmd.append("if=%s" % iff)
if off:
cmd.append("of=%s" % off)
return cmd
def _execute_qvmrun(self, cmd, stdin, stdout):
subcmd = " ".join(pipes.quote(s) for s in cmd)
cmd = ["qvm-run", "--pass-io", "--", self.hostname, subcmd]
return subprocess.Popen(
cmd,
stdin=stdin,
stdout=stdout,
bufsize=MAX_LIST_SIZE,
close_fds=True
)
def put(self, source_path, remote_filename=None):
"""Transfers a single file to the remote side."""
if not remote_filename:
remote_filename = source_path.get_filename()
self._validate_remote_filename("put", remote_filename)
rempath = os.path.join(self.remote_dir, remote_filename)
cmd = self._dd(off=rempath)
fobject = open(source_path.name, "rb")
try:
p = self._execute_qvmrun(cmd,
stdin=fobject,
stdout=open(os.devnull, "wb"))
except Exception, e:
raise BackendException(
"Qubes VM put of %s (as %s) failed: (%s) %s" % (
source_path.name, remote_filename, type(e), e))
finally:
fobject.close()
err = p.wait()
if err != 0:
raise BackendException(
("Qubes VM put of %s (as %s) failed: writing the "
"destination path exited with nonzero status %s") % (
source_path.name, remote_filename, err))
def get(self, remote_filename, local_path):
"""Retrieves a single file from the remote side."""
self._validate_remote_filename("get", remote_filename)
rempath = os.path.join(self.remote_dir, remote_filename)
cmd = self._dd(iff=rempath)
fobject = open(local_path.name, "wb")
try:
p = self._execute_qvmrun(cmd,
stdin=open(os.devnull),
stdout=fobject)
except Exception, e:
raise BackendException(
"Qubes VM get of %s (as %s) failed: (%s) %s" % (
remote_filename, local_path.name, type(e), e))
finally:
fobject.close()
err = p.wait()
if err != 0:
raise BackendException(
("Qubes VM get of %s (as %s) failed: writing the "
"destination path exited with nonzero status %s") % (
remote_filename.name, local_path, err))
def _list(self):
"""Lists the contents of the one duplicity dir on the remote side."""
cmd = ["find", self.remote_dir, "-maxdepth", "1", "-print0"]
try:
p = self._execute_qvmrun(cmd,
stdin=open(os.devnull, "rb"),
stdout=subprocess.PIPE)
except Exception, e:
raise BackendException(
"Qubes VM list of %s failed: %s" % (self.remote_dir, e))
data = p.stdout.read(MAX_LIST_SIZE)
p.stdout.close()
err = p.wait()
if err != 0:
raise BackendException(
("Qubes VM list of %s failed: list command finished "
"with nonzero status %s" % (self.remote_dir, err)))
if not data:
raise BackendException(
("Qubes VM list of %s failed: list command returned "
"empty" % (self.remote_dir,)))
filename_list = data.split("\0")
if filename_list[0] != self.remote_dir:
raise BackendException(
("Qubes VM list of %s failed: list command returned a "
"filename_list for a path different from the remote folder") % (
self.remote_dir,))
filename_list.pop(0)
if filename_list[-1]:
raise BackendException(
("Qubes VM list of %s failed: list command returned "
"wrongly-terminated data or listing was too long") % (
self.remote_dir,))
filename_list.pop()
filename_list = [ p[len(self.remote_dir) + 1:] for p in filename_list ]
if any(os.path.sep in p for p in filename_list):
raise BackendException(
("Qubes VM list of %s failed: list command returned "
"a path separator in the listing") % (
self.remote_dir,))
return filename_list
def delete(self, filename_list):
"""Deletes all files in the list on the remote side."""
if any(os.path.sep in p or "\0" in p for p in filename_list):
raise BackendException(
("Qubes VM delete of files in %s failed: delete "
"command asked to delete a file with a path separator "
"or a null character in the listing") % (
self.remote_dir,))
pathlist = [os.path.join(self.remote_dir, p) for p in filename_list]
cmd = ["rm", "-f", "--"] + pathlist
try:
p = self._execute_qvmrun(cmd,
stdin=open(os.devnull, "rb"),
stdout=open(os.devnull, "wb"))
except Exception, e:
raise BackendException(
"Qubes VM delete of files in %s failed: %s" % (
self.remote_dir, e))
err = p.wait()
if err != 0:
raise BackendException(
("Qubes VM delete of files in %s failed: delete "
"command finished with nonzero status %s") % (
self.remote_dir, err))
duplicity.backend.register_backend("qubesvm", QubesVMBackend)
Finally: I disagree with the direction of implementation which suggests we need to play Towers of Hanoi with a backup VM and attaching disk images to it. That's an entirely unnecessary complication, and it also demands that VMs be off for backup purposes. Entirely unnecessary and extremely discouraging of backup procedures.
The three-step process is all that is necessary:
- Snapshot the container volume of /var/lib/qubes, then mount it somewhere.
- Invoke duplicity with a wrapper W that sets the appropriate options to back up /mnt/snapshot/var/lib/qubes (I believe there's even an option to trim the mountpoint out of the sent paths, much like tar --strip-components).
- Unmount and destroy the snapshot.
That is all that is needed.
Of course, the Qubes setting of which VMs to back up, plus options like -x that are supported in qvm-backup, would also be sensible things to support in the wrapper W.
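A minimal sketch of such a wrapper W, assuming /var/lib/qubes lives on an LVM volume and that the qubesvm:// backend from above is installed; the LV names, snapshot size, mountpoint and destination are placeholders:
#!/usr/bin/env python
# Hypothetical wrapper "W": snapshot the volume holding /var/lib/qubes, back
# it up with duplicity via the qubesvm:// backend, then clean up.
# LV names, snapshot size, mountpoint and destination are placeholders.
import subprocess

ORIGIN = "qubes_dom0/root"          # LV holding /var/lib/qubes
SNAP_NAME = "qubes-backup-snap"
SNAP_DEV = "/dev/qubes_dom0/" + SNAP_NAME
MNT = "/mnt/snapshot"               # assumed to already exist
DEST = "qubesvm://mybackupvm/mnt/disk/qubesbackup"

subprocess.check_call(["lvcreate", "-s", "-n", SNAP_NAME, "-L", "4G", ORIGIN])
subprocess.check_call(["mount", "-o", "ro", SNAP_DEV, MNT])
try:
    subprocess.check_call([
        "duplicity",
        "--gpg-options", "--cipher-algo=AES256 --digest-algo=SHA256",
        MNT + "/var/lib/qubes", DEST])
finally:
    subprocess.check_call(["umount", MNT])
    subprocess.check_call(["lvremove", "-f", "qubes_dom0/" + SNAP_NAME])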
Update: Duplicity is no good for the process, because differential backups slow to a crawl.
We need to research other stuff, like Attic.
I have not experienced such an issue with Duplicity. I have managed to perform an incremental backup of roughly 15GiB (after exclusions) of various data (many small files and a few large files) in ~2 minutes even on HDD + dm-crypt. (Of course, this depends on the size of the changes.) Encryption and compression were enabled.
Maybe it skips many files just because of timestamps and diffs just a few files.
So, I feel you must be doing something wrong. (No offense.)
What was your scenario? What files did you try to backup? (E.g., dom0's ~ with 1GiB of small files.) What was your drive setup? (E.g., 7200 RPM HDD with dm-crypt with ext4.) How much time did it take? Where do you store the backup? (Other VM using your storage backend? Well, theoretically, this should not matter so much for performance of scanning, as Duplicity caches metadata locally, AFAIK somewhere in ~/.cache. But I am curious.) Did you have the metadata cached? How much data did you add to the backup? (Well, if you backup ~ and you don't exclude ~/.cache, you might add Duplicity metadata to the backup, which could explain both some time and space penalty. I am not sure if Duplicity is smart enough to exclude this automagically.)
On 07/19/2016 05:45 PM, Vít Šesták wrote:
My backup of a 60 GB disk image progressed essentially nothing over the course of five hours. I think there is an exponential slowdown after a certain size.
Rudd-O
http://rudd-o.com/
Inspired by the lecture pointed to by @Rudd-O in a parallel ticket, what about this:
- Create a full backup as usual, but for each data block written to the backup (or skipped because of being empty), compute a hash and store it in some file in dom0. For each VM image, there will be a file with a list of block hashes. Keep this file in dom0 (together with a reference to the backup id); no need to include it in the backup.
- When creating an incremental backup, run our simple tar implementation, but instead of creating a sparse archive as "don't include zeroed blocks", make it "don't include blocks already backed up in the full archive" - based on the hashes created in step 1 (see the toy sketch further below). This looks like a one-liner change here.
- Add info to the backup header that the backup is incremental over another one (reference it with the backup id).
- During restore, proceed with the full backup as usual, then restore the incremental one over it. Extracting a sparse archive should do the job; the only thing to make sure of is not to truncate the output file (in the case of files) in the process.
That would be:
- simple to implement
- require minimal format change (full backup will be 100% compatible with the current format, incremental one - mostly)
- easy to recover the data even without Qubes; something like this should do:
tar xSOf private.img.tar.full > private.img
tar xSOf private.img.tar.incremental | dd of=private.img conv=sparse,notrunc
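To make steps 1-2 above concrete, here is a toy sketch of the block-hash bookkeeping; SHA-256 and a 1 MiB block size are placeholders, since both are among the open details listed next:
import hashlib

BLOCK = 1024 * 1024   # illustrative block size - an open question below

def block_hashes(image_path):
    """Step 1: hash every block of the image as it goes into the full backup;
    the resulting list stays in dom0 next to the backup id."""
    hashes = []
    with open(image_path, "rb") as img:
        while True:
            block = img.read(BLOCK)
            if not block:
                break
            hashes.append(hashlib.sha256(block).digest())
    return hashes

def changed_blocks(image_path, full_hashes):
    """Step 2: yield (offset, data) only for blocks whose hash differs from the
    full backup - the blocks the incremental sparse-style archive would keep.
    (Ignores the case where the image has grown since the full backup.)"""
    with open(image_path, "rb") as img:
        for i, old_digest in enumerate(full_hashes):
            block = img.read(BLOCK)
            if not block:
                break
            if hashlib.sha256(block).digest() != old_digest:
                yield i * BLOCK, block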
Some details to work out:
- what hash algorithm? what block size for identical-block detection? this directly influences how much disk space will be needed for storing the hash lists
- how many incremental backups should we support? the format does not impose any limit, but the more layers, the harder the restore process will be; I vote for just 1 (one)
- design the restore workflow to be as simple as possible, and at the same time hard to screw up; especially: the restore process needs to handle the full backup first, but only the incremental one has a reference to which full backup it is based on; maybe retrieve the incremental one first, store it in some temporary location, then proceed to the full one, and apply the incremental one afterwards? or just retrieve (all selected) backup headers first, verify what to do and only then retrieve the actual data?
Any thoughts? @Rudd-O @defuse @jpouellet
@marmarek I like the simplicity and elegance, but unfortunately there is a problem with that.
Blocks which have been zeroed, backed up, then restored, will not be faithfully reproduced. In the source of the incremental backup the block contains zeros, but in the restored image it would contain whatever it contained before being zeroed.
Consider this demonstration:
$ perl -e 'print "A"x(1024*4)' > A
$ perl -e 'print "B"x(1024*4)' > B
$ perl -e 'print "C"x(1024*4)' > C
$ perl -e 'print "\x00"x(1024*4)' > 0
$ hexdump -C A
00000000 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 |AAAAAAAAAAAAAAAA|
*
00001000
$ hexdump -C B
00000000 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 |BBBBBBBBBBBBBBBB|
*
00001000
$ hexdump -C C
00000000 43 43 43 43 43 43 43 43 43 43 43 43 43 43 43 43 |CCCCCCCCCCCCCCCC|
*
00001000
$ hexdump -C 0
00000000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00001000
$ cat A 0 C > full
$ cat A B 0 > incremental
$ dd if=full of=restore
$ cat full > restore
$ cat incremental | dd of=restore conv=sparse,notrunc
24+0 records in
24+0 records out
12288 bytes (12 kB) copied, 8.6333e-05 s, 142 MB/s
$ hexdump -C full
00000000 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 |AAAAAAAAAAAAAAAA|
*
00001000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00002000 43 43 43 43 43 43 43 43 43 43 43 43 43 43 43 43 |CCCCCCCCCCCCCCCC|
*
00003000
$ hexdump -C incremental
00000000 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 |AAAAAAAAAAAAAAAA|
*
00001000 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 |BBBBBBBBBBBBBBBB|
*
00002000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00003000
$ hexdump -C restore
00000000 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 41 |AAAAAAAAAAAAAAAA|
*
00001000 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 42 |BBBBBBBBBBBBBBBB|
*
00002000 43 43 43 43 43 43 43 43 43 43 43 43 43 43 43 43 |CCCCCCCCCCCCCCCC|
*
00003000
In other words, if "unchanged" is encoded as "skip this" which is decoded as zeroes, and then all zeroes are interpreted as "skip this", then you have an unintended consequence of zeroed data in your source image being interpreted as "unchanged" rather than "been changed to zeroes". We would need a trustworthy way of disambiguating the two.
@jpouellet I see, but the problem is only with the simplified restore path - because dd cannot distinguish between a zeroed block and a block not included in the archive. But the information is there. In fact the only function of dd here is to avoid truncating when restoring to a file. It isn't needed when restoring to a block device. So - probably better would be to use a loop device instead of dd. Or find some tar option which I'm not aware of.
Mmm, maybe this trick would work:
tar xSOf private.img.tar.full > private.img
exec 3<>private.img
tar xSOf private.img.tar.incremental >&3
On 11/04/2016 10:04 AM, Marek Marczykowski-Górecki wrote:
Any thoughts? @Rudd-O @defuse @jpouellet
Your scheme sounds correct.
Unfortunately, I detest full+incremental backup series — I prefer an opaque encrypted block store that can be added to, and cleaned up, because it's more efficient and I do not need to diddle / fuck around with tracking full+incremental series. It's why I was looking at Attic, until I found out about Duplicati.
The implementation and algorithm suggested by Duplicati — with a minor tweak to add privacy by preventing deduplication between VMs — seems much better, more efficient, faster, and cryptographically sound.
I understand that, as the software is today, it does not fulfill the requirements of Qubes (restore with no extra tools), and I respect that.
Rudd-O
http://rudd-o.com/
On 11/05/2016 12:33 AM, Marek Marczykowski-Górecki wrote:
Honestly, it's become time to write a proper restore tool. The tool I wrote and posted on the other bug does precisely this — skips over the blocks that are zeroes. Unfortunately, it looks like that tool won't work for this purpose because you may need byte-level granularity?
Rudd-O
http://rudd-o.com/
The approach used by Apple's Time Machine has great merit, enjoying all the benefits being sought here:
- Quick handling of very large files
- Incremental
- Instant pruning/rotation of old backups
- Efficient with disk space on the live system
- No heavy data-processing during backup
It even allows mounting of encrypted volumes remotely.
Apple's approach is valuable to Qubes because they catered to their users' tendency to create very large files (usually media projects) often with small, frequent changes which rendered traditional incremental backup tools inefficient.
To get TM working with encryption, Apple had the user's home folder mounted from a 'sparsebundle'... a disk image that is actually a folder with a bunch of encrypted 2MB files. When a virtual disk block within any given 2MB 'band' is written, the band's backing file in the sparsebundle's folder would naturally have its mtime updated. --- That is the only 'special sauce' needed to implement backups that work like Time Machine.
The system only has to remember the time of the last backup and scan the sparsebundle dir by mtime to quickly find the changed data. For completeness, each session folder on the destination gets the unmodified bands from the prior backup in the form of hardlinks; This makes each backup increment whole and yet simple to manage (deleting a whole backup session won't delete a data band unless that folder was the only one referencing it).
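A toy sketch of that mtime-scan-plus-hardlink logic over a sparsebundle-style directory of band files; the paths and the way the time of the last backup is remembered are purely illustrative:
import os
import shutil

def backup_bands(bands_dir, prev_session, new_session, last_backup_time):
    """Copy only the band files modified since the last backup; hardlink the
    unchanged ones from the previous session so that every session folder is
    a complete, independently deletable backup."""
    os.makedirs(new_session)
    for band in sorted(os.listdir(bands_dir)):
        src = os.path.join(bands_dir, band)
        dst = os.path.join(new_session, band)
        if os.stat(src).st_mtime > last_backup_time:
            shutil.copy2(src, dst)                            # changed band
        else:
            os.link(os.path.join(prev_session, band), dst)    # unchanged band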
The trick for Qubes would be to have the guest VM storage layer emulate the sparsebundle, with its band files--though examples do exist on Linux. If actual image files aren't involved, use a block driver capable of storing a bitmap to flag each virtual band as it is modified.
Nearly all the other techniques involve processing the entirety of the source data during each backup. That is too CPU-, disk- and time-intensive to saddle on laptop users.
One decent alternative would be to utilize the more advanced Linux storage layers: Thin provisioned LVM and Btrfs can both quickly discern deltas between snapshots, resulting in efficient backup processing. Btrfs comes with btrfs-send which handles this function, and LVM has example code available. The downsides are that old backups cannot be sent in encrypted form and integrated/pruned on the destination unless the VM data is already encrypted as separate LUKS/dmcrypt containers; the live system also has to use extra space to hold the prior-backup's snapshot.
This is an interesting idea, but not directly applicable to Qubes without (IMHO too large) compromises on privacy and complexity.
- Storing a VM disk image as 2MB files (or such) seems technically hard - I don't know any sane Linux method to do that. An insane one: create a loop device for each part and combine them using device mapper. Or: check whether qemu could do that (we don't want qemu in dom0).
- This method implies more trust in the backup-storing facility. Mostly about privacy (a backup observer can see which parts of the data you have changed and draw some conclusions from that), but handling integrity would also be more complex (need to ensure that parts archived during very different backup runs are not manipulated, reordered, etc.). This is the price for easier cleanup of old backups. IMHO too high.
But indeed, combining some smart method of listing modified blocks (instead of comparing hashes) with https://github.com/QubesOS/qubes-issues/issues/858#issuecomment-258388616 would be nice. Can you point me to exact info on how to get a modified-block bitmap between two LVM thin snapshots, or the Btrfs equivalent?
An additional thing to learn from this - Apple has chosen 2MB chunks; I guess this is based on some research, so I'd reuse it.
@marmarek : Maybe 'insane' on Linux but not BSD... for some reason? ...don't know... :)
Here is Apple's open source code that makes it possible, in 480 lines:
http://www.opensource.apple.com/source/hfs/hfs-195/CopyHFSMeta/SparseBundle.c
That's it. Once you have that, you can do backups of VM image deltas just like 'rsnapshot' and several others would normally do with individual files on Linux. The destination just has to support hardlinking, and plenty do nowadays. Performance wise, it has been acceptable for millions of Mac users.
- I thought of the dm idea, but it isn't made for that. As Apple did, a block device is necessary... and not hard to do. Here are other implementations:
https://github.com/jeffmahoney/sparsebundle-loopback https://github.com/torarnv/sparsebundlefs https://github.com/vasi/rhfs
- There is probably no way around this: either integrate pieces incrementally and risk having that analyzed, or bundle the deltas like piles of wood logs, making the restore process far longer and more precarious. Apple didn't think it was enough of an issue, and many secure systems back up small discrete files anyway, initiated from within the guest VMs.
I am also not against what you're saying on this point, and I think a "minimally incremental" process that needs to re-do a full backup periodically may ultimately be preferable. But there is a more manageable way to do it that would allow people to backup hourly without pain.
Also, the implementation does not have to be with file bundles... It can be 99.9% normal Linux block device using LVM or single img file, but with some extra handling of a bitmap. So instead of mtimes in a file bundle, you are checking for 'changed' bits in the bitmap.
If you don't go with the Time Machine approach, you should at least consider using LVM/Btrfs snapshots as the means of finding volume deltas. If I know my system needs to backup 200MB of changes, I do not want to see it wheezing at full throttle for 40 minutes to get there... I'd rather just have a slightly larger disk that can handle the snapshots and backup in less than 5 minutes.
@marmarek : Also, before going too much out on a limb... IIRC in the old bugtrack ticket Joanna said she thought the TM-style approach sounded good. That's how I remembered it.
I thought of the dm idea, but it isn't made for that. As Apple did, a block device is necessary... and not hard to do. Here are other implementations:
https://github.com/jeffmahoney/sparsebundle-loopback https://github.com/torarnv/sparsebundlefs https://github.com/vasi/rhfs
I wonder about performance of those. In most cases FUSE isn't good at it. We already have I/O performance reduced enough...
I am also not against what you're saying on this point, and I think a "minimally incremental" process that needs to re-do a full backup periodically may ultimately be preferable. But there is a more manageable way to do it that would allow people to backup hourly without pain.
Yes, this.
If you don't go with the Time Machine approach, you should at least consider using LVM/Btrfs snapshots as the means of finding volume deltas. If I know my system needs to backup 200MB of changes, I do not want to see it wheezing at full throttle for 40 minutes to get there... I'd rather just have a slightly larger disk that can handle the snapshots and backup in less than 5 minutes.
Yes, I'd like to (as we'll have snapshots in place in Qubes 4.0 anyway). This is why I've asked how to extract LVM/Btrfs volume delta (modified blocks bitmap or such).
Yes, I'd like to (as we'll have snapshots in place in Qubes 4.0 anyway). This is why I've asked how to extract LVM/Btrfs volume delta (modified blocks bitmap or such).
Have you seen this yet? https://github.com/jthornber/thinp-test-suite/blob/master/incremental_backup_example
For Btrfs, you can look at the btrfs-send source. You could also just use btrfs-send itself, as it outputs a stream.
Re: Block devices, it doesn't have to be FUSE. It can be done as a kernel driver, or even as a modification to the existing driver.
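For the Btrfs route, a minimal sketch of capturing an incremental stream with btrfs send; the snapshot paths and output location are placeholders, not an existing Qubes layout:
import subprocess

# Read-only snapshots taken at the previous and the current backup run.
PARENT = "/var/lib/qubes/.snapshots/appvms-2016-11-01"
CURRENT = "/var/lib/qubes/.snapshots/appvms-2016-11-22"

# "btrfs send -p" emits only the extents that changed relative to the parent
# snapshot; the stream could be encrypted and shipped to a backup VM, and
# replayed on restore with "btrfs receive".
with open("/var/backups/appvms.btrfs-stream", "wb") as out:
    subprocess.check_call(["btrfs", "send", "-p", PARENT, CURRENT], stdout=out)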
thin_delta seems to be exactly what is needed. A bit low level, but well...
Note to self - required operations:
# Note snapshot device ids:
lvdisplay -m qubes_dom0/VMNAME-private-snapshot1
lvdisplay -m qubes_dom0/VMNAME-private-snapshot2
# create "pool metadata snapshot" - required by thin_delta tool to access active pool
dmsetup message /dev/mapper/qubes_dom0/pool0-tpool 0 reserve_metadata_snap
# Compare snapshots - output is XML
thin_delta --thin1 <first snapshot id> --thin2 <second snapshot id> -m /dev/mapper/qubes_dom0-pool0_tmeta
# release "pool metadata snapshot"
dmsetup message /dev/mapper/qubes_dom0/pool0-tpool 0 release_metadata_snap
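To turn that XML into something a backup tool can consume, a sketch along these lines could extract the changed block ranges; the element and attribute names ("different", "right_only", "begin", "length", "data_block_size") reflect my understanding of thin_delta's output and should be checked against the installed version:
import xml.etree.ElementTree as ET

def changed_ranges(thin_delta_xml):
    """Yield (begin, length), in pool data blocks, for ranges that differ
    between the two snapshots according to thin_delta's XML output.
    (The superblock's data_block_size attribute gives the block size in
    512-byte sectors, if byte offsets are needed.)"""
    root = ET.fromstring(thin_delta_xml)
    for node in root.iter():
        if node.tag in ("different", "right_only"):
            yield int(node.get("begin")), int(node.get("length", "1"))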
The downside of having a full snapshot vs just block hashes is of course higher disk usage. Some may be willing to pay this price, some not - I guess it will depend on the actual additional space needed and full-backup frequency for a particular user.
On 11/22/2016 10:30 AM, tasket wrote:
The approach used by Apple's Time Machine has great merit, enjoying all the benefits being sought here:
- Quick handling of very large files
YES!
Nearly all the other techniques involve processing the entirety of the source data during each backup. That is too CPU-, disk- and time-intensive to saddle on laptop users.
Indeed. This is why incremental backups using duplicity (for which I wrote the backend code above) take FOREVER.
I would prefer something like a btrfs delta snapshot being backed up, because that's even faster than what Mac OS X does. But it doesn't seem to be the case that anyone at ITL wants to implement that.
Rudd-O
http://rudd-o.com/