hash should be an object

Open pwalsh opened this issue 7 years ago • 10 comments

We currently support a special prefix on the hash to declare hashing algorithms that are not MD5. However, we are moving away from this type of overloading in almost all places in the spec.

So, I suggest:

"hash": {
  "type": "md5",
  "value": "{HASH}"
}
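For illustration only, here is a minimal sketch of how an existing prefixed string could be mapped onto the object form suggested above. The function name `parse_hash` is hypothetical and not part of any spec or library; it assumes that an unprefixed value means MD5, as described in this issue.

```python
def parse_hash(value: str) -> dict:
    """Split 'algorithm:digest' into the proposed {'type', 'value'} object.

    Assumption: a value without a prefix is an MD5 digest.
    """
    algorithm, sep, digest = value.partition(":")
    if not sep:
        # No prefix present, so fall back to the assumed MD5 default.
        return {"type": "md5", "value": value}
    return {"type": algorithm, "value": digest}
```

With this sketch, `parse_hash("sha256:abc")` yields `{"type": "sha256", "value": "abc"}`, while a bare digest is labelled as `md5`.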

pwalsh avatar Feb 27 '17 13:02 pwalsh

@pwalsh agreed -- certainly for the "rigorous" version of the spec.

rufuspollock avatar Mar 08 '17 06:03 rufuspollock

@pwalsh do we want to do this in v1.0?

rufuspollock avatar May 24 '17 00:05 rufuspollock

@roll @akariv do you want this for v1.0, or is v1.1 ok?

pwalsh avatar May 29 '17 07:05 pwalsh

@pwalsh It's a breaking change. v1.0 vs v1.1 depends on the breaking changes policy for the specs.

PS. But I suppose no implementation uses it for now, so it's more of a conceptual than a practical question.

roll avatar May 29 '17 08:05 roll

I have to say I also like the simplicity of a single value with no sub-object. Do we know what other systems do here, e.g. debs etc.?

rufuspollock avatar May 30 '17 14:05 rufuspollock

@rufuspollock and @roll did this ever make it into (some version of) the specs?

I've noticed that if I use frictionless.py to read a data package with a string value in the hash property, it moves the digest into a stats object and adds a hashing property.

That is, this datapackage.json

{
  "profile": "data-package",
  "resources": [
    {
      "profile": "data-resource",
      "name": "estados",
      "path": "estados.csv",
      "hash": "sha256:c280dab2e21da93be52aef5a4c934abdd4d70d9981f59372e3f36f4ca8b1ac38"
    }
  ],
  "name": "datapackage-reprex"
}

After

from frictionless import Package

dp = Package('datapackage.json')

dp.to_json('datapackage.json')

is serialized as

{
  "profile": "data-package",
  "resources": [
    {
      "profile": "data-resource",
      "name": "estados",
      "path": "estados.csv",
      "hashing": "sha256",
      "stats": {
        "hash": "c280dab2e21da93be52aef5a4c934abdd4d70d9981f59372e3f36f4ca8b1ac38"
      }
    }
  ],
  "name": "datapackage-reprex"
}

I'm starting to use this in a production context and it would be nice to know the recommended approach moving forward.
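Until there is a recommendation, one option is to read both shapes defensively. The following is a rough sketch, not an official API: `resource_hash` is a hypothetical helper, and the MD5 fallback when no algorithm is given is an assumption based on the prefix convention discussed in this issue.

```python
from typing import Optional, Tuple


def resource_hash(resource: dict) -> Optional[Tuple[str, str]]:
    """Return (algorithm, digest) from either representation shown above."""
    if "hash" in resource:
        # String form: "sha256:<digest>" or a bare digest.
        algorithm, sep, digest = resource["hash"].partition(":")
        # Assumption: a bare digest without prefix is MD5.
        return (algorithm, digest) if sep else ("md5", resource["hash"])
    # Serialized form: "hashing" plus "stats.hash".
    digest = resource.get("stats", {}).get("hash")
    if digest is None:
        return None
    # Assumption: a missing "hashing" property also means MD5.
    return (resource.get("hashing", "md5"), digest)
```

Both resource descriptors above would then give the same `("sha256", "c280...")` pair.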

fjuniorr avatar Sep 10 '21 19:09 fjuniorr

@fjuniorr I don't believe this ever made it into the spec, so I believe we are still using a single string value with an optional prefix.

rufuspollock avatar Sep 13 '21 13:09 rufuspollock

Hi @rufuspollock, can you share the plan for releasing a v1.1 or v2 that covers stats?

rgaiacs avatar Aug 12 '22 00:08 rgaiacs

You might wish to consider multi-hash encoding https://github.com/multiformats/multihash/blob/master/README.md

It is used in IPFS for cryptographic content identifiers https://richardschneider.github.io/net-ipfs-core/articles/multihash.html

-- Regards, Randy

rjgladish avatar Aug 12 '22 00:08 rjgladish

@rgaiacs the stats stuff is #364, I think. Do you want to comment there (especially on what you'd like to see)?

rufuspollock avatar Aug 23 '22 10:08 rufuspollock

Due to historical reasons, I propose to close this issue as wontfix.

For both publishers and implementors, there is really no difference whether this property is a string or an object, as either is very easy to use and implement. I think there is no additional value in making changes here, since for v2 and later we strictly try to avoid breaking changes.

In general, I think allowing multiple hashes could bring some additional value, but that can be done via a new non-breaking property like resource.hashes.[md5/sha256/etc] (although it really needs to be justified first).
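To make the multiple-hashes idea concrete, here is a small sketch of how such a mapping could be computed. Note that `resource.hashes` is only a hypothetical property floated in this comment, not part of the spec, and `compute_hashes` is an illustrative name; the only library used is Python's standard `hashlib`.

```python
import hashlib


def compute_hashes(data: bytes, algorithms=("md5", "sha256")) -> dict:
    """Build a {algorithm: hex digest} mapping for a hypothetical 'hashes' property."""
    # hashlib.new() looks up the algorithm by name and hashes the given bytes.
    return {name: hashlib.new(name, data).hexdigest() for name in algorithms}
```

A publisher could attach the result as `{"hashes": compute_hashes(content)}` next to the existing `hash` string, which would leave current consumers unaffected.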

roll avatar Jan 03 '24 15:01 roll

CLOSED as wontfix

roll avatar Jan 25 '24 08:01 roll