zfsbackup-go
[Feature Request] Include PAR as an option
How does it handle the backup files becoming corrupted? Maybe include PAR as an option.
The original intention for this application is for the files to reside on storage targets that are not prone to such issues, such as those offered by Google, Amazon, Backblaze, Azure, etc.
I did look into adding this feature (specifically, Reed-Solomon erasure encoding) but reasoned it was not required for this project.
If there is interest in using targets that don't have such file resiliency guarantees where adding parity bits to each file makes sense, I'd be happy to add this as an option in the future.
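For anyone curious what that would look like in practice, here is a minimal sketch of per-chunk parity using the klauspost/reedsolomon Go library. The library choice, shard counts, and chunk handling are assumptions for illustration only, not how zfsbackup-go would necessarily implement the option.

```go
package main

import (
	"fmt"
	"log"

	"github.com/klauspost/reedsolomon"
)

func main() {
	// Hypothetical parameters: 10 data shards + 3 parity shards per backup chunk.
	enc, err := reedsolomon.New(10, 3)
	if err != nil {
		log.Fatal(err)
	}

	// In zfsbackup-go this would be one chunk of the ZFS send stream;
	// here it is just placeholder data.
	chunk := []byte("contents of one backup chunk ...")

	// Split the chunk into 13 shards (10 data shards + 3 empty parity shards).
	shards, err := enc.Split(chunk)
	if err != nil {
		log.Fatal(err)
	}

	// Compute the parity shards from the data shards.
	if err := enc.Encode(shards); err != nil {
		log.Fatal(err)
	}

	// Simulate losing two shards; nil marks a shard as missing.
	shards[0], shards[4] = nil, nil

	// Reconstruct the missing shards from the surviving ones.
	if err := enc.Reconstruct(shards); err != nil {
		log.Fatal(err)
	}

	ok, err := enc.Verify(shards)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("parity verified after reconstruction:", ok)
}
```

With 10+3 shards you can lose any 3 of the 13 pieces and still recover the chunk, at the cost of 30% extra storage; those ratios would presumably be configurable.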
I would definitely suggest something along these lines. "File resiliency guarantees" don't help you when the file you are working with is somehow corrupt.
But then, I'm a backup nut. I use ZFS send/recv, BackupPC, Borg, and Proxmox all on the same data.
So S3, Google, Azure, and Backblaze guarantee data integrity? -- I'm looking at using this tool/project to back up my ZFS pool to Backblaze.
Yes, almost all intended targets come with some SLA on the durability of the data you store there, and since zfsbackup-go supports multiple targets, you can increase your durability by utilizing multiple providers. Note: I am still working on adding the Azure/Backblaze targets; they should land within a week or two.
- Google: 99.999999999% durability - they mention the use of erasure encodings.
- AWS S3: 99.999999999% durability - they mention the use of checksums on the data for integrity validation and repair (something that sounds similar to what ZFS does, though I'm sure it's more distributed and complicated than that).
- Azure: Although no durability target is provided, they give an in-depth explanation of their architecture, which you can read here - they use Reed-Solomon erasure encoding, and you can increase durability by increasing your redundancy options.
- Backblaze: 99.999999% durability - they also utilize Reed-Solomon erasure encoding and have even open sourced their implementation of it.
I also use the checksum features available on each target to ensure that data is delivered intact when storing it (e.g. CRC32C for Google, MD5 for S3, etc.). This is all computed as the ZFS send stream is chunked and made ready for uploading.
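For reference, a rough sketch of the kind of per-chunk checksums involved, using only the Go standard library. The chunk contents and encoding details here are placeholders; the actual upload code in zfsbackup-go may handle this differently.

```go
package main

import (
	"crypto/md5"
	"encoding/base64"
	"fmt"
	"hash/crc32"
)

func main() {
	// Placeholder for one chunk of the ZFS send stream.
	chunk := []byte("one chunk of the zfs send stream ...")

	// CRC32C (Castagnoli polynomial), the checksum GCS can validate uploads against.
	castagnoli := crc32.MakeTable(crc32.Castagnoli)
	crc := crc32.Checksum(chunk, castagnoli)

	// MD5, the checksum S3 can compare against a supplied Content-MD5 value.
	sum := md5.Sum(chunk)

	fmt.Printf("crc32c: %08x\n", crc)
	fmt.Println("md5 (base64):", base64.StdEncoding.EncodeToString(sum[:]))
}
```

Because the provider recomputes the checksum on its side and rejects the upload on a mismatch, this catches corruption in transit; the parity discussion above is about corruption after the data is at rest.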
Nice! Thanks for the writeup! Should we put this in the README for reference?