backintime
Idea: Improve efficiency to destinations using ZFS (or similar)
Describe the problem, feature or ask a question:
When creating a destination on a filesystem that supports reflinks, the initial copy ("cp -aRl") could use reflinks instead of hard links, and the rsync command could then use the --inplace switch to get the most benefit from CoW, so that only the changed parts of a file occupy new space.
Hopefully this would be a minimal change, as most of the burden falls on the filesystem in question.
Hello Alvamiga, thank you for your idea. I have to say I don't have enough expert knowledge about filesystems to even understand your idea. Can you give more details, please? ;)
The following terms are unclear to me:
- reflinks
- CoW
- ZFS (I know it is a filesystem, of course.)
Hello Alvamiga, can you provide some more details and give me feedback about the topic?
Regards, Christian
Sorry for the delay in getting back to you.
Yes, ZFS is a filesystem but, like several other new ones, it has additional features, such as the two I have mentioned.
When you reflink a file, the copy initially looks like a hard link: it shares the same data blocks as the original. The critical difference is that writing to either file triggers a copy on write (CoW): only the blocks being written are copied, and everything else keeps using the same data blocks as the original.
Writing to either file creates new blocks as required, but any unchanged sections of the file occupy the same blocks on disk. For example, file 1 could be ABCDEF and file 2 ABCGEH; between them they use ABCDEFGH, only 8 blocks instead of the 12 that two full copies of the file would require.
If you did modify the entire file, the two files would eventually just look like two entirely separate files, but until then they occupy less space on the drive.
My suggestion means that if you were to back up a 20GB ISO, for example, modifying 100MB of it would allow you to still have both versions backed up, only taking 20.1GB of space, instead of the 40GB hard-links would require (because any change creates an entirely new version in full).
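To make the arithmetic in the two examples above concrete, here is a minimal sketch in Python. It is only a toy model, not how any filesystem is implemented: each letter stands for one block, blocks are compared offset by offset, and the function names are made up for illustration.

```python
def blocks_with_reflink(original: str, modified: str) -> int:
    """Blocks needed to keep both versions when unchanged blocks are shared.

    Toy model: one letter is one block, and a block is shared only if it is
    identical at the same offset in both files.
    """
    changed = sum(1 for old, new in zip(original, modified) if old != new)
    appended = max(0, len(modified) - len(original))
    return len(original) + changed + appended


def blocks_with_full_copy(original: str, modified: str) -> int:
    """Blocks needed when the modified version is stored as a separate full copy."""
    return len(original) + len(modified)


def two_versions_mb(file_mb: int, changed_mb: int, shared: bool) -> int:
    """Space in MB for two versions of a file of which changed_mb was rewritten."""
    return file_mb + (changed_mb if shared else file_mb)


print(blocks_with_reflink("ABCDEF", "ABCGEH"))    # 8
print(blocks_with_full_copy("ABCDEF", "ABCGEH"))  # 12
print(two_versions_mb(20_000, 100, shared=True))   # 20100 MB, i.e. 20.1GB
print(two_versions_mb(20_000, 100, shared=False))  # 40000 MB, i.e. 40GB
```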
What I am suggesting would be: You would use "cp --reflink" to create the new tree for the backup (which currently uses --link). You can then use --inplace with the rsync command. (The rsync man page actually mentions CoW filesystems in its "--inplace" section, for the reasons I have explained.)
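As a rough sketch of the two commands involved, the following Python builds the argument lists without running them. The function names, the `use_reflink` switch and the example paths are hypothetical and not taken from BIT's code; `--reflink=always` assumes GNU cp. The extra `--no-whole-file` is an assumption on my part: according to the rsync man page, whole-file transfer is the default when both paths are local, which would rewrite every block even with --inplace.

```python
from typing import List


def clone_tree_cmd(previous_snapshot: str, new_snapshot: str,
                   use_reflink: bool) -> List[str]:
    """Command that creates the new snapshot tree from the previous one."""
    # Hard links stay the default; reflinks are strictly opt-in.
    link_flag = "--reflink=always" if use_reflink else "--link"
    return ["cp", "-a", link_flag, previous_snapshot, new_snapshot]


def rsync_cmd(source: str, new_snapshot: str, use_reflink: bool) -> List[str]:
    """Command that updates the new snapshot tree from the live source."""
    cmd = ["rsync", "-a"]
    if use_reflink:
        # Update changed files in place so unchanged blocks stay shared.
        # Never combine this with hard-linked trees (see the warning below).
        cmd += ["--inplace", "--no-whole-file"]
    # Trailing slash: copy the contents of source, not the directory itself.
    cmd += [source.rstrip("/") + "/", new_snapshot]
    return cmd


print(clone_tree_cmd("/backup/old", "/backup/new", use_reflink=True))
print(rsync_cmd("/home/user", "/backup/new", use_reflink=True))
```

Each list could be passed to `subprocess.run(cmd, check=True)`; they are only printed here so that the sketch has no side effects.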
It would have to be an option that the user explicitly enables, because doing this on a non-CoW system could have unwanted effects: modifying a hard-linked copy in place would propagate the change back through the history, or backups would end up as full copies that are not hard-linked at all. (It would be relatively simple to add an automatic test.)
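One way such an automatic test might look, as a sketch only: try to reflink a small probe file inside the destination and see whether cp succeeds. This assumes GNU cp, where `--reflink=always` fails instead of silently falling back to a normal copy. The function name `supports_reflink` is hypothetical.

```python
import os
import subprocess
import tempfile


def supports_reflink(directory: str) -> bool:
    """Return True if a reflink copy succeeds inside the given directory."""
    with tempfile.TemporaryDirectory(dir=directory) as tmp:
        src = os.path.join(tmp, "probe.src")
        dst = os.path.join(tmp, "probe.dst")
        with open(src, "wb") as handle:
            handle.write(b"x" * 4096)
        try:
            # With --reflink=always, GNU cp exits non-zero when the
            # filesystem cannot clone the file.
            result = subprocess.run(
                ["cp", "--reflink=always", src, dst],
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
            )
        except FileNotFoundError:
            # No cp binary available at all.
            return False
        return result.returncode == 0


print(supports_reflink(tempfile.gettempdir()))
```

The result depends on the filesystem that holds the probed directory, so the check would need to run against the actual backup destination rather than a system-wide temp folder.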
Hope this clarifies my idea. Let me know if this helps or there's anything that's still unclear.
Hello Alvamiga, thank you for explaining. Now I get it.
It is a big, complex feature, but it sounds nice to me. Currently BIT seems a bit too unstable, smelly and untested for me to tackle something like this. But we are still improving it. So it won't happen soon, but I'll keep it open.
Please see our Strategy Outline to get an idea of the broader issues we have to tackle in order to modernize BIT's codebase.
Regards, Christian