specs
hash should be an object
We currently support a special prefixing on the hash to declare hashing algorithms that are not MD5. However, we are moving away from this type of overloading in almost all places in the spec.
So, I suggest:
"hash": {
  "type": "md5",
  "value": "{HASH}"
}
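For illustration, moving from the current prefixed string to the object form suggested above could look like the following minimal sketch. The helper name `hash_to_object` is hypothetical, and the sketch assumes that an unprefixed value means MD5, which is how the prefix convention is described in this issue.

```python
def hash_to_object(value: str) -> dict:
    """Convert 'sha256:abc...' or a bare digest into {'type': ..., 'value': ...}."""
    # partition() splits at the first colon; sep is "" when there is no prefix.
    algorithm, sep, digest = value.partition(":")
    if not sep:
        # Assumption: no prefix means MD5.
        return {"type": "md5", "value": value}
    return {"type": algorithm.lower(), "value": digest}
```

Keeping the conversion this small is the point: either representation carries the same two pieces of information.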
@pwalsh agreed -- certainly for the "rigorous" version of the spec.
@pwalsh do we want to do this in v1.0?
@roll @akariv do you want this for v1.0, or is v1.1 ok?
@pwalsh It's a breaking change. v1.0 vs v1.1 depends on the breaking changes policy for the specs.
PS. But I suppose no implementation uses it for now, so it's more of a conceptual than a practical question.
I have to say I also like the simplicity of a single value, with no sub-object. Do we know what other systems do here, e.g. debs etc?
@rufuspollock and @roll did this ever make it into (some version of) the specs?
I've noticed that if I read a data package with frictionless.py with a string value in the hash property it will convert it to an object and add a hashing property.
That is, this datapackage.json
{
  "profile": "data-package",
  "resources": [
    {
      "profile": "data-resource",
      "name": "estados",
      "path": "estados.csv",
      "hash": "sha256:c280dab2e21da93be52aef5a4c934abdd4d70d9981f59372e3f36f4ca8b1ac38"
    }
  ],
  "name": "datapackage-reprex"
}
After
from frictionless import Package
dp = Package('datapackage.json')
dp.to_json('datapackage.json')
is serialized as
{
  "profile": "data-package",
  "resources": [
    {
      "profile": "data-resource",
      "name": "estados",
      "path": "estados.csv",
      "hashing": "sha256",
      "stats": {
        "hash": "c280dab2e21da93be52aef5a4c934abdd4d70d9981f59372e3f36f4ca8b1ac38"
      }
    }
  ],
  "name": "datapackage-reprex"
}
I'm starting to use this in a production context and it would be nice to know the recommended approach moving forward.
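If the spec-style string is needed again after such a round trip, one possible workaround is to rebuild it from the two serialized fields. This is only a sketch operating on the plain JSON shown above: `to_prefixed_hash` is a hypothetical helper, not part of frictionless.py, and treating a missing hashing property as MD5 is an assumption.

```python
def to_prefixed_hash(resource: dict) -> str:
    """Rebuild 'algorithm:digest' from the `hashing` and `stats.hash` fields."""
    digest = resource["stats"]["hash"]
    # Assumption: a resource without `hashing` was hashed with MD5.
    algorithm = resource.get("hashing", "md5")
    # MD5 digests are written without a prefix; other algorithms get one.
    return digest if algorithm == "md5" else f"{algorithm}:{digest}"
```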
@fjuniorr I don't believe this ever made it into the spec, so I believe we are still with a single string value with an optional prefix.
Hi @rufuspollock, can you share the plan for releasing a v1.1 or v2 that covers stats?
You might wish to consider multi-hash encoding https://github.com/multiformats/multihash/blob/master/README.md
It is used in IPFS for cryptographic content identifiers https://richardschneider.github.io/net-ipfs-core/articles/multihash.html
-- Regards, Randy
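For readers unfamiliar with multihash, the idea is that the digest is prefixed with a function code and a length, so the value is self-describing. The following is a minimal sketch for SHA-256 only, written by hand rather than with a multihash library; the function name `sha256_multihash` is hypothetical. It relies on the sha2-256 code being 0x12 and on both the code and the 32-byte length fitting in a single-byte varint.

```python
import hashlib

SHA2_256_CODE = 0x12  # multihash function code for sha2-256


def sha256_multihash(data: bytes) -> bytes:
    """Return <code><digest length><digest> for a SHA-256 digest."""
    digest = hashlib.sha256(data).digest()
    # Both values are below 128, so each is one byte in varint encoding.
    return bytes([SHA2_256_CODE, len(digest)]) + digest
```

Hex-encoded, such a value starts with `1220`, followed by the usual 64 hex characters of the digest.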
@rgaiacs the stats stuff is #364, I think. Do you want to comment there (especially on what you'd like to see)?
Due to historical reasons, I propose to close this issue as wontfix.
For both publishers and implementors, there is really no difference whether this property is a string or an object, as either is very easy to use and implement. I think there is no additional value in making changes here since, for v2 and following, we strictly try to avoid breaking changes.
In general, I think what could really bring some additional value is allowing multiple hashes, but that can be done via a new non-breaking property like resource.hashes.[md5/sha256/etc] (although it really needs to be justified first).
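To make the multiple-hashes idea concrete, here is a rough sketch of how a publisher might fill such a property. Note that `hashes` is not part of any released spec; it is only the hypothetical property floated in this comment, and `compute_hashes` is an illustrative helper name.

```python
import hashlib


def compute_hashes(data: bytes, algorithms=("md5", "sha256")) -> dict:
    """Map each algorithm name to the hex digest of `data`."""
    return {name: hashlib.new(name, data).hexdigest() for name in algorithms}


# Hypothetical resource descriptor carrying several digests at once.
resource = {
    "name": "estados",
    "path": "estados.csv",
    "hashes": compute_hashes(b"example bytes"),
}
```

Because this would be a new property, existing descriptors with a string hash would stay valid.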
CLOSED as wontfix