Don't require Tabular Data Package to have `profile = tabular-data-package`

Open peterdesmet opened this issue 3 years ago • 5 comments

Thanks to https://github.com/frictionlessdata/frictionless-py/issues/618, it is now possible to define a custom (community) data package profile that extends data-package. Like so (e.g. https://raw.githubusercontent.com/tdwg/camtrap-dp/main/camtrap-dp-profile.json):

"allOf": [
{
  "$ref": "https://frictionlessdata.io/schemas/data-package.json"
},
{
  "required": [
     ..

A data package using this profile would have:

"profile": "https://raw.githubusercontent.com/tdwg/camtrap-dp/main/camtrap-dp-profile.json"

And would validate against everything in data-package as well as in camtrap-dp-profile.

If the custom profile requires the data package to be a tabular data package as well, you could extend as such:

"allOf": [
{
  "$ref": "https://frictionlessdata.io/schemas/tabular-data-package.json"
},
{
  "required": [
     ..

But that introduces a validation error:

package-error  The data package has an error: "'https://raw.githubusercontent.com/tdwg/camtrap-dp/main/camtrap-dp-profile.json' is not one of ['tabular-data-package']" at "profile" in metadata and at "allOf/0/properties/profile/enum" in profile      

That is because the Tabular Data Package spec requires `profile` to be exactly `tabular-data-package`, which closes it off to any extension. Is this by design?
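To make the conflict concrete, here is a minimal sketch in Python. It uses a toy checker that only understands `allOf`, `required`, and `enum`; it is not a JSON Schema validator and not how frictionless-py validates. The two base schemas are heavily simplified stand-ins for `data-package.json` and `tabular-data-package.json`, and `example_required_property`, the resource entry, and the helper names are hypothetical placeholders (the real `required` list is elided above).

```python
# Toy checker: supports only "allOf", "required" and "properties/<name>/enum".
# This is an illustration, NOT a full JSON Schema implementation.
def check(schema, doc, path=""):
    errors = []
    for i, sub in enumerate(schema.get("allOf", [])):
        errors += check(sub, doc, f"{path}allOf/{i}/")
    for name in schema.get("required", []):
        if name not in doc:
            errors.append(f"{path}required: missing {name!r}")
    for name, rules in schema.get("properties", {}).items():
        if name in doc and "enum" in rules and doc[name] not in rules["enum"]:
            errors.append(
                f"{path}properties/{name}/enum: {doc[name]!r} is not one of {rules['enum']}"
            )
    return errors


# Simplified stand-ins for the published base profiles (assumption: reduced
# to the parts relevant to this issue).
DATA_PACKAGE = {"required": ["resources"]}
TABULAR_DATA_PACKAGE = {
    "required": ["resources", "profile"],
    "properties": {"profile": {"enum": ["tabular-data-package"]}},
}


def custom_profile(base):
    # Hypothetical community profile: base schema plus one extra requirement.
    return {"allOf": [base, {"required": ["example_required_property"]}]}


CUSTOM_URL = "https://raw.githubusercontent.com/tdwg/camtrap-dp/main/camtrap-dp-profile.json"

descriptor = {
    "profile": CUSTOM_URL,
    "example_required_property": "some value",        # hypothetical
    "resources": [{"name": "data", "path": "data.csv"}],  # hypothetical
}

# Extending data-package: the custom profile URL is accepted.
errors_data = check(custom_profile(DATA_PACKAGE), descriptor)

# Extending tabular-data-package: the enum on "profile" rejects the URL.
errors_tabular = check(custom_profile(TABULAR_DATA_PACKAGE), descriptor)

for error in errors_tabular:
    print(error)
```

The single error is reported at `allOf/0/properties/profile/enum`, the same location as in the message above: a descriptor cannot point `profile` at the custom profile and satisfy the enum at the same time.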

peterdesmet avatar Jun 15 '21 13:06 peterdesmet

@peterdesmet I had to fix it for Fiscal Data Package shipped with Frictionless. I think it needs to be fixed in the specs too

roll avatar Jun 15 '21 16:06 roll

> I think it needs to be fixed in the specs too

As in: don't make it required that profile is tabular-data-package?

peterdesmet avatar Jun 15 '21 17:06 peterdesmet

@peterdesmet Yes

roll avatar Jun 15 '21 17:06 roll

This issue was brought up again in https://github.com/frictionlessdata/frictionless-r/issues/186.

I think Tabular Data Package as a spec should be deprecated. It can remain as a guide, but should not have a spec imo. The current spec has 3 requirements:

> There MUST be at least one resource in the resources array

This is already a requirement in Data Package: https://specs.frictionlessdata.io/data-package/#required-properties

> There MUST be a profile property with the value tabular-data-package

This doesn't add much, and as described above, disables building on top of tabular-data-package. We should focus on Tabular Data Resource instead.

> Each resource MUST be a Tabular Data Resource

I'm not sure why it is an advantage that each resource must be a Tabular Data Resource.

peterdesmet avatar Mar 20 '24 09:03 peterdesmet

@peterdesmet Peter, I totally agree. Since the introduction of Tabular Data Resource, Tabular Data Package has basically lost all the added value it had. It was initially created to guarantee that resources are tables, and that guarantee now lives at the resource level.

Of course, there is no reason to require ALL the resources to be tables, as it is good practice to include some documentation and other materials, e.g. a license.
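As a rough sketch of what "on the resource level" could look like in practice: a package that mixes a table with a license file, where each resource declares its own profile. The package name, resource names, paths, and the `tabular_resources` helper are hypothetical; only the `tabular-data-resource` and `data-resource` profile values come from the specs.

```python
def tabular_resources(descriptor):
    """Return the names of resources that declare the tabular-data-resource profile."""
    return [
        resource.get("name")
        for resource in descriptor.get("resources", [])
        if resource.get("profile") == "tabular-data-resource"
    ]


# Hypothetical descriptor: one table plus one non-tabular resource.
descriptor = {
    "name": "example-package",
    "resources": [
        {
            "name": "observations",
            "path": "observations.csv",
            "profile": "tabular-data-resource",
            "schema": {"fields": [{"name": "id", "type": "integer"}]},
        },
        {
            "name": "license",
            "path": "LICENSE.md",
            "profile": "data-resource",
        },
    ],
}

tabular = tabular_resources(descriptor)
non_tabular = [r["name"] for r in descriptor["resources"] if r["name"] not in tabular]

print("tabular:", tabular)
print("other:", non_tabular)
```

The table guarantee applies to `observations` through its own profile, so nothing at the package level needs to forbid the license file.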

roll avatar Mar 27 '24 15:03 roll