borg: restore big files/block devices, taking already present data into account

I use borg to back up my LVM volumes. This works fine at backup creation time, but not as well when I run 'borg extract': it pulls many gigabytes of data from the remote borg repo server even though I still have an almost unchanged device in place.

It would be nice to have an option that does not just extract data from a borg repo, but first checks whether an existing file/block device already has the required data, and only retrieves the changed data from the remote repo.

Jan 20 '21 13:01 Lanozavr

Yeah, that would be good.

Notable:

when writing to a block device, one would likely use borg extract --stdout ... > /dev/blkxxx to extract directly to the block device. when doing that, borg does not even know where the data is going / what it could compare against
for similar reasons borg create --stdin ... might be used at backup time - in this case we might not have the blockdevice name inside the archive
this is the simpler part of a more complex "missing feature": that of bringing a local data set into an archived state (here, simple: 1 file == a block device, more complex case: a filesystem with existing directories and existing files)
borg create already has --read-special to indicate that special files should actually have their content read, not just the device file archived. in that case, we have blockdevice names (relative though) in the archive, but at extraction time, desired device names could be different

Jan 20 '21 13:01 ThomasWaldmann

If A is the archived data stream and F is the fs data stream:

iterate over A chunks list to get chunkid_A, size
read size bytes from F, compute chunkid_F
if chunkid_A == chunkid_F, we already have the correct data, next...
if not, read chunk from A, write to F (seek to correct pos first), next...
at the end of A, truncate F to the same size.
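
The steps above can be sketched in Python roughly as follows. This is only an illustration, not borg code: `chunk_id`, `sync_to_archive` and `fetch_chunk` are made-up names, and plain SHA-256 stands in for borg's real chunk id (which is a keyed hash). A is modelled as a list of (chunk id, size) pairs plus a callable that fetches a chunk by id.

```python
import hashlib
import io


def chunk_id(data: bytes) -> bytes:
    # Assumption: unkeyed SHA-256 as a stand-in for borg's keyed chunk id.
    return hashlib.sha256(data).digest()


def sync_to_archive(chunks, fetch_chunk, target, truncate=True):
    """Bring `target` (F) into the state described by `chunks` (A).

    chunks      -- iterable of (chunk_id_A, size) in stream order
    fetch_chunk -- hypothetical callable: chunk id -> chunk bytes (the expensive remote read)
    target      -- seekable binary file object opened for reading and writing
    Returns the number of chunks that had to be fetched.
    """
    offset = 0
    fetched = 0
    for cid, size in chunks:
        # Read `size` bytes from F at the current position and hash them.
        target.seek(offset)
        local = target.read(size)
        if len(local) != size or chunk_id(local) != cid:
            # Mismatch (or F is too short): fetch the chunk from A and write it in place.
            data = fetch_chunk(cid)
            target.seek(offset)
            target.write(data)
            fetched += 1
        offset += size
    if truncate:
        # At the end of A, cut F down to the same size.
        # Pass truncate=False for targets that cannot be resized.
        target.truncate(offset)
    return fetched


# Small in-memory demonstration: only the middle chunk differs.
parts = [b"aaaa", b"XXXX", b"cccc"]
store = {chunk_id(p): p for p in parts}
chunk_list = [(chunk_id(p), len(p)) for p in parts]
existing = io.BytesIO(b"aaaa" + b"bbbb" + b"cccc" + b"tail")
fetched_count = sync_to_archive(chunk_list, store.__getitem__, existing)
```

In this sketch only mismatching chunks cause a call to `fetch_chunk`, which is where the savings for an almost unchanged device would come from. The local read and hash of F still cover the whole stream.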

Jan 20 '21 13:01 ThomasWaldmann