restore big files/block devices while taking already present data into account
I use borg to make backups of my LVM volumes. It works fine at backup creation time, but not as well when I do 'borg extract': it pulls many gigabytes of data from the remote borg repo server even though I still have an almost unchanged device in place.
It would be nice to have an option to not just extract data from a borg repo, but to also check whether an existing file/block device already has the required data, and then only retrieve the changed data from the remote repo.
Yeah, that would be good.
Notable:
- when writing to a block device, one likely would use `borg extract --stdout ... > /dev/blkxxx` to directly extract to the block device, and when doing that, borg does not even know where the data is going / what it could compare against
- for similar reasons, `borg create --stdin ...` might be used at backup time; in this case we might not have the blockdevice name inside the archive
- this is the simpler part of a more complex "missing feature": that of bringing a local data set into an archived state (here, simple: 1 file == a block device; more complex case: a filesystem with existing directories and existing files)
- borg create already has `--read-special` to indicate that special files should actually have their content read, not just the device file archived. In that case, we have blockdevice names (relative though) in the archive, but at extraction time, the desired device names could be different
If A is the archived data stream and F is the fs data stream:
- iterate over A chunks list to get chunkid_A, size
- read size bytes from F, compute chunkid_F
- if chunkid_A == chunkid_F, we already have the correct data, next...
- if not, read chunk from A, write to F (seek to correct pos first), next...
- at the end of A, truncate F to the same size.
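
To make the steps above concrete, here is a rough Python sketch. It is not borg code and does not use borg's internals: `sync_to_archive`, `chunk_id` and `fetch_chunk` are hypothetical names, the archive is modelled as a plain list of `(chunk id, size)` pairs, and plain SHA-256 stands in for borg's real (keyed) chunk IDs.

```python
import hashlib
import io


def chunk_id(data: bytes) -> bytes:
    # Assumption: plain SHA-256 as a stand-in for the repo's real chunk ID function.
    return hashlib.sha256(data).digest()


def sync_to_archive(chunks, fetch_chunk, target, truncate=True):
    """Bring `target` (F) into the state described by `chunks` (A).

    chunks      -- iterable of (chunk_id, size) pairs, in stream order
    fetch_chunk -- hypothetical callable: chunk_id -> bytes (the expensive remote read)
    target      -- seekable binary file object opened for reading and writing
    truncate    -- set to False for block devices, which cannot be truncated

    Returns the number of bytes that had to be fetched.
    """
    offset = 0
    fetched = 0
    for cid, size in chunks:
        # Read the same byte range from F and compute its ID.
        target.seek(offset)
        local = target.read(size)
        if len(local) != size or chunk_id(local) != cid:
            # Local data differs (or F is too short): fetch from A and overwrite.
            data = fetch_chunk(cid)
            target.seek(offset)
            target.write(data)
            fetched += len(data)
        offset += size
    if truncate:
        # At the end of A, cut F down to the archived size.
        target.truncate(offset)
    return fetched


# Tiny in-memory demo: only the middle chunk differs, and F has trailing junk.
parts = [b"a" * 8, b"b" * 8, b"c" * 4]
store = {chunk_id(p): p for p in parts}          # pretend remote repo
chunk_list = [(chunk_id(p), len(p)) for p in parts]
device = io.BytesIO(b"a" * 8 + b"X" * 8 + b"c" * 4 + b"junk")
fetched_bytes = sync_to_archive(chunk_list, store.__getitem__, device)
```

In the demo only the 8 bytes of the changed middle chunk are fetched; the other two chunks are verified locally by hashing. Note that this only works if F is read with the same chunk boundaries as recorded in A, which is why the sizes come from the archive's chunk list rather than from re-chunking F.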