B2_Command_Line_Tool
Set large_file_sha1 for large files.
The B2 docs recommend the file info `large_file_sha1` for storing the checksum of the entire contents of a large file. It would be nice for the command-line tool to know how to set it. The drawback to setting it is that it requires reading the entire file to compute the SHA-1 before starting to upload it.
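To make the cost concrete, here is a minimal sketch of the extra pass that would be needed before the upload starts. The helper names `whole_file_sha1` and `build_file_info` are hypothetical and are not part of the command-line tool or any B2 SDK; only `hashlib` from the Python standard library is used.

```python
import hashlib


def whole_file_sha1(path, chunk_size=1024 * 1024):
    """Return the hex SHA-1 of the file at `path`.

    The file is read in fixed-size chunks so memory use stays constant,
    but every byte is still read once before any upload can begin.
    """
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        # iter() with a sentinel stops when read() returns b"" at EOF.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_file_info(path):
    """Hypothetical helper: build the file info dict for a large file."""
    return {"large_file_sha1": whole_file_sha1(path)}
```

The full read in `whole_file_sha1` is exactly the overhead being discussed: for a large file it roughly doubles the local I/O compared with uploading the parts alone.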
The question is: Should setting it be the default (with an option to turn it off), or should setting it require an explicit command-line option?
What benefit (other than following the documentation) for the end user would that achieve?
IMO hashes of parts are sufficient and an overall hash is not worth the additional read or client-side implementation of it.
Enabling this by default could severely impact some of our users by consuming additional resources and failing to fit in the set backup window. I think we should keep the compatibility in this case.
I agree with @ppolewicz. There is really not much you can do with these hashes, since you can't rely on them being available (or even correct). But I think it would be useful to have an optional flag for this, if the user needs the hash for a specific scenario.
Also, it would be easier on the client if the hash could be specified after uploading. For large files, the hash could be supplied to the `b2_finish_large_file` function.
I agree that this should be an option you can enable, not the default.
@svonohr: Eventually, we want to be able to supply this checksum at the end. Even that, though, will require processing the parts in sequence, not in parallel, to compute the checksum.
If `--threads 1` is used or if the sync planner is smart, providing the hash at the end can avoid the additional read.
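A rough sketch of why sequential processing avoids the second read: when parts are consumed in order, one running SHA-1 can be updated alongside the per-part hashes in a single pass. The function `hash_parts_in_order` is hypothetical, and the sketch assumes the whole-file hash could be supplied once the last part is done, which is the wished-for behaviour discussed above rather than something this thread confirms the API supports.

```python
import hashlib
import io


def hash_parts_in_order(stream, part_size):
    """Read `stream` once, part by part, and return
    (list of per-part hex SHA-1s, whole-file hex SHA-1).

    The whole-file digest is only correct because parts are fed to it
    in order; parts hashed in parallel could not share one running digest.
    """
    whole = hashlib.sha1()
    part_sha1s = []
    while True:
        part = stream.read(part_size)
        if not part:
            break
        part_sha1s.append(hashlib.sha1(part).hexdigest())
        whole.update(part)
        # A real uploader would send `part` here before reading the next one.
    return part_sha1s, whole.hexdigest()


if __name__ == "__main__":
    parts, whole = hash_parts_in_order(io.BytesIO(b"abc" * 1000), 1024)
    print(len(parts), whole)
```

With more than one upload thread, parts may be read and hashed out of order, so the running digest cannot be maintained this way without extra coordination.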