For discussion: full-file hashes in hyperdrive metadata

Open bnewbold opened this issue 7 years ago • 4 comments

Rendered pre-merge

"Full-file hashes are optionally included in hyperdrive metadata to complement the existing cryptographic-strength hashing of sub-file chunks. Multiple popular hash algorithms can be included at the same time."

Previous discussion:

  • https://github.com/mafintosh/hyperdrive/issues/203
  • https://github.com/datproject/discussions/issues/77#issuecomment-354969691

Before seriously reviewing to merge as Draft, I would want to demonstrate working code and have a better idea what the API would look like, but this is otherwise pretty fleshed out.

cc: @martinheidegger

bnewbold, Mar 18 '18 05:03

Other advantages:

  • Hashes can be used as pointers on the network that imply happens-before causality. That is: if a record points to the hash of a file, and the record has not been changed since it was created, then we know that the pointed-to file existed before the record did.
  • The discovery & wire network can be expanded in the future to fetch individual files using these hashes.
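To make the happens-before point concrete, here is a minimal Python sketch. The `Record` type and the `file_hash` and `points_to` helpers are hypothetical names invented for illustration, not part of hyperdrive or any Dat API, and blake2b-256 is an arbitrary choice of hash.

```python
import hashlib
from dataclasses import dataclass


def file_hash(content: bytes) -> str:
    """Full-file blake2b-256 digest, hex-encoded."""
    return hashlib.blake2b(content, digest_size=32).hexdigest()


@dataclass(frozen=True)
class Record:
    """Hypothetical immutable record that points at a file by its hash."""
    note: str
    target_hash: str


def points_to(record: Record, content: bytes) -> bool:
    """True if `content` is exactly the file the record was created against."""
    return record.target_hash == file_hash(content)


original = b"first version of the file\n"
record = Record(note="reviewed", target_hash=file_hash(original))

# The record could only embed this hash if the file content already existed,
# so a match implies the file predates the record.
print(points_to(record, original))                # True
print(points_to(record, b"edited afterwards\n"))  # False
```

The guarantee only holds as long as the hash function is collision resistant and the record itself is tamper-evident, for example because it lives in an append-only log.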

pfrazee, Mar 18 '18 18:03

"I feel like there are various recommendations missing from the dat project"

This was intentional. I don't think we should be overly prescriptive; folks should be able to adapt this feature to their own needs (including ones we aren't even thinking of at this time).

For the sake of simplicity, if users or implementors don't want to be bothered choosing or coordinating algorithms to use by default, this draft does say:

"For 2018, recommended default full-file hash functions to include are SHA1 (for popularity and interoperability) and blake2b-256 (already used in other parts of the Dat protocol stack)."
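As a rough sketch of what computing both recommended hashes could look like, the Python snippet below feeds a file through SHA1 and blake2b-256 in a single read pass. The function name `full_file_hashes` and the dictionary keys are assumptions made for this example; the draft does not define this API, and this is not how hyperdrive stores the values.

```python
import hashlib
import io
from typing import BinaryIO, Dict


def full_file_hashes(stream: BinaryIO, chunk_size: int = 64 * 1024) -> Dict[str, str]:
    """Read the stream once and return hex digests for several algorithms."""
    hashers = {
        "sha1": hashlib.sha1(),
        "blake2b-256": hashlib.blake2b(digest_size=32),  # 32 bytes = 256 bits
    }
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        # Every hasher sees the same chunk, so the file is only read once.
        for hasher in hashers.values():
            hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}


# An in-memory stream stands in for a real file opened with open(path, "rb").
digests = full_file_hashes(io.BytesIO(b"hello dat\n"))
for name, value in digests.items():
    print(name, value)
```

Reading the file once keeps the I/O cost the same as for a single hash, although each extra algorithm still adds CPU time.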

bnewbold, Mar 20 '18 18:03

I seem to have overlooked that section :blush:

"For 2018, recommended default full-file hash functions to include are SHA1 (for popularity and interoperability) and blake2b-256 (already used in other parts of the Dat protocol stack)."

Now I am all good - though @mafintosh might have something to say about additionally having to compute SHA1.

https://github.com/mafintosh/hyperdrive/issues/203#issuecomment-367472911

martinheidegger, Mar 20 '18 21:03

Maybe "widely used"

On March 20, 2018 2:29:29 PM PDT, Martin Heidegger wrote:

martinheidegger commented on this pull request.

+Type: Standard
+
+Status: Undefined (as of YYYY-MM-DD)
+
+Github PR: Discussion
+
+Authors: Bryan Newbold
+
+
+# Summary
+[summary]: #summary
+
+Full-file hashes are optionally included in hyperdrive metadata to complement
+the existing cryptographic-strength hashing of sub-file chunks. Multiple
+popular hash algorithms can be included at the same time.

This was a bit of nitpicking based on my impression that the sentence would have the same meaning and impact without "popular"; it got blown out of proportion.

View it on GitHub: https://github.com/datprotocol/DEPs/pull/12#discussion_r175928843

bnewbold, Mar 20 '18 21:03