[idea] smart replacement of sources in the multiple backup target case
This is somewhat related to #436, but takes a different approach. There is no parallelism involved, and I think it's less complex.
My scenario is this:
From my backup server, btrbk is pulling snapshots (backups) from remote servers. Each such remote snapshot is backed up to 2 different physical drives on my backup server. So if one drive fails, I still have the backups on the other. If all goes well, I have identical backups on the two drives. This setup typically produces this kind of console output:
remote-server.example.com:/mnt/btr_pool/volumes/volume-photo
>>> /mnt/disk1/snapshots/volume-photo.20251206T1100
>>> /mnt/disk2/snapshots/volume-photo.20251206T1100
The issue is that the network between my remote server(s) and the backup server has limited bandwidth, certainly much lower than the I/O throughput between the local disks of the backup server. Nevertheless, btrbk pulls each snapshot twice, which makes the entire process inefficient. If I've got a large diff (e.g. after a postgres "full vacuum" on the volume to back up), one run can take several hours and therefore overrun the time slot I've allocated.
(I know I could use some variant of the "archive" command to copy from disk1 to disk2. But since that command doesn't have its own step in "run" – e.g. after "Delete Snapshots" – it would need its own configuration and invocation. This results in concurrency/locking problems when "btrbk run" and "btrbk archive" run in parallel.)
This is my idea:
For the above scenario, btrbk could recognize that, after pulling the snapshot from remote-server to disk1, it can copy it from disk1 to disk2 much more quickly than pulling it from remote-server a second time. It would know that both disk1 and disk2 are local targets. So btrbk could transparently replace the source, just for this operation, without any need for special configuration by the user.
Of course, replacing the source is only possible if the first backup remote-server→disk1 didn't fail, and if the matching parent exists on both disks. If these preconditions aren't met, btrbk wouldn't replace the source and would continue as normal, as the current version does.
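To make the preconditions concrete, here is a rough sketch of the decision in Python. It is not btrbk code (btrbk is written in Perl), and `Target`, `choose_source`, and the older snapshot name used as parent are hypothetical names I made up for illustration:

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Target:
    """Hypothetical model of one backup target (not a btrbk data structure)."""
    path: str
    is_local: bool
    snapshots: set = field(default_factory=set)  # snapshot names present on the target


def choose_source(remote_source: str, snapshot: str, parent: Optional[str],
                  target: Target, earlier_targets: List[Target]) -> str:
    """Pick the source for sending `snapshot` to `target`.

    `earlier_targets` are the targets already processed in this run.
    Falls back to `remote_source` whenever a precondition is not met.
    """
    if not target.is_local:
        return remote_source
    for candidate in earlier_targets:
        if not candidate.is_local:
            continue
        # Precondition 1: the first transfer (remote -> candidate) succeeded.
        if snapshot not in candidate.snapshots:
            continue
        # Precondition 2: for an incremental send, the parent exists on both disks.
        if parent is not None and not (
            parent in candidate.snapshots and parent in target.snapshots
        ):
            continue
        return f"{candidate.path}/{snapshot}"
    return remote_source
```

With disk1 already holding the new snapshot and both disks holding the parent, the function would return the path on disk1; in every other case it returns the remote source unchanged, which matches the current behaviour.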
As a stretch goal, the concept could be extended to targets reachable via high-bandwidth LAN. Users could globally configure targets (or just hostnames) which would join the default group of local targets. This would accommodate cases of multiple (somewhat separated) backup servers.
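The group membership check for the stretch goal could look roughly like this. Again a Python sketch, not a proposal for actual btrbk internals: `target_host`, `in_local_group`, and the hostname `backup2.lan` are made up, and I'm assuming targets are written either as a plain path, as `host:/path`, or as `ssh://host/path`:

```python
from typing import Optional, Set
from urllib.parse import urlsplit


def target_host(target: str) -> Optional[str]:
    """Return the hostname of a target, or None for a plain local path."""
    if target.startswith("ssh://"):
        return urlsplit(target).hostname
    if target.startswith("/"):
        return None
    host, sep, _ = target.partition(":")
    return host if sep else None


def in_local_group(target: str, fast_hosts: Set[str]) -> bool:
    """True if the target is on this machine or on a host the user
    declared as reachable via high-bandwidth LAN (hypothetical setting)."""
    host = target_host(target)
    return host is None or host in fast_hosts
```

Targets on this machine are always in the group, so the default (an empty host list) behaves exactly like the basic idea above.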