How to convert a v1 to a v2 package

Open peterdesmet opened this issue 1 year ago • 3 comments

I wanted to assess the complexity of converting a v1 to a v2 Data Package. Below are the steps that need to be taken. For version detection, see #262. @khusmann could you review these? There are a couple of items I'm unsure about.

Package

Add package.$schema, remove package.profile

Derive package.$schema from the value of package.profile, then remove package.profile.

  • [ ] NULL => https://datapackage.org/profiles/2.0/datapackage.json
  • [ ] data-package (registered id) => https://datapackage.org/profiles/2.0/datapackage.json
  • [ ] tabular-data-package (registered id) => https://datapackage.org/profiles/2.0/datapackage.json. This also removes deprecated tabular-data-package
  • [ ] fiscal-data-package (registered id) => Unsure, should we use the 1.0 URL for fiscal-data-package?
  • [ ] A URL => Unsure, the referenced schema will likely point to Data Package v1, making it a v1
  • [ ] Any other value => Unsure, not allowed by https://specs.frictionlessdata.io/profiles/
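A minimal sketch of the settled part of this mapping, written in Python purely for illustration (frictionless-r is an R package, and `upgrade_package_profile` and the constants are hypothetical names, not an existing API). The cases marked "Unsure" above are deliberately left untouched rather than guessed at.

```python
V2_PACKAGE_SCHEMA = "https://datapackage.org/profiles/2.0/datapackage.json"

# Registered v1 ids whose mapping is settled in the list above.
SETTLED_V1_PROFILES = {"data-package", "tabular-data-package"}


def upgrade_package_profile(package: dict) -> dict:
    """Return a copy with `$schema` set and `profile` removed where the mapping is settled."""
    result = dict(package)
    profile = result.get("profile")
    if profile is None or profile in SETTLED_V1_PROFILES:
        result.pop("profile", None)
        result["$schema"] = V2_PACKAGE_SCHEMA
    # Open questions (fiscal-data-package, a URL, any other value):
    # the descriptor is returned unchanged so nothing is lost.
    return result
```

Returning a copy keeps the original v1 descriptor available in case the conversion has to be rolled back.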

Add package.contributors.roles

  • [ ] For each contributor set roles (array) based on role (string). Remove role
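The contributor step could look roughly like this. It is a Python sketch with a hypothetical function name; it assumes that a contributor which already has `roles` keeps that array and only loses the old `role` string, which is a choice this issue does not settle.

```python
def upgrade_contributors(package: dict) -> dict:
    """Return a copy where each contributor's `role` string becomes a `roles` array."""
    result = dict(package)
    if "contributors" not in package:
        return result
    upgraded = []
    for contributor in package["contributors"]:
        contributor = dict(contributor)      # do not mutate the input
        role = contributor.pop("role", None)  # remove the v1 property
        if role is not None and "roles" not in contributor:
            contributor["roles"] = [role]
        upgraded.append(contributor)
    result["contributors"] = upgraded
    return result
```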

Other changes

Each resource

Add resource.$schema, remove resource.profile

Derive resource.$schema from the value of resource.profile, then remove resource.profile.

  • [ ] NULL => https://datapackage.org/profiles/2.0/dataresource.json
  • [ ] data-resource (registered id) => https://datapackage.org/profiles/2.0/dataresource.json
  • [ ] tabular-data-resource (registered id) => https://datapackage.org/profiles/2.0/dataresource.json (but see resource.type)
  • [ ] A URL => Unsure, the referenced schema will likely point to Data Package v1, making it a v1
  • [ ] Any other value => Unsure, not allowed by https://specs.frictionlessdata.io/profiles/
  • [ ] There is also the edge case where resource.$schema is already present (i.e. a v1 package with a v2 resource) => Unsure, should the existing resource.$schema be left as is?

Add resource.type

Use resource.profile:

  • [ ] NULL => don't set
  • [ ] tabular-data-resource => table
  • [ ] Any other value or URL => don't set
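Since resource.$schema and resource.type are both derived from resource.profile, the order matters: the profile has to be read for both before it is removed. Here is a Python sketch with hypothetical names. It leaves an existing `$schema` as is, which is only one possible answer to the open question above, and it leaves URL or unknown profiles untouched.

```python
V2_RESOURCE_SCHEMA = "https://datapackage.org/profiles/2.0/dataresource.json"

# Registered v1 ids whose mapping is settled.
SETTLED_V1_PROFILES = {"data-resource", "tabular-data-resource"}


def upgrade_resource_profile(resource: dict) -> dict:
    """Return a copy with `type` and `$schema` derived from `profile`."""
    result = dict(resource)
    profile = result.get("profile")
    # Set type first, while the profile value is still known.
    if profile == "tabular-data-resource":
        result["type"] = "table"
    if profile is None or profile in SETTLED_V1_PROFILES:
        result.pop("profile", None)
        # Assumption: an existing $schema is kept rather than overwritten.
        result.setdefault("$schema", V2_RESOURCE_SCHEMA)
    return result
```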

Other changes

  • [x] resource.sources: no change required
  • [x] resource.name: rules are relaxed, existing names can remain as is
  • [x] resource.path: dot-paths are now forbidden. In the edge case where such a path is provided, we should not convert it, because it is impossible to know what the correct path would be. These paths will be flagged when reading a resource.
  • [x] resource.encoding: allows more, no action required

For each dialect

Note that upconverting a remote dialect requires downloading it and including it inline in the package.

Add dialect.$schema

  • [ ] dialect.caseSensitiveHeader is present => https://datapackage.org/profiles/1.0/tabledialect.json
  • [ ] dialect.csvddfVersion is present => https://datapackage.org/profiles/1.0/tabledialect.json
  • [ ] Otherwise this can safely be set to https://datapackage.org/profiles/2.0/tabledialect.json

Unsure about this though. For example, if a dialect was absent (very often the case), one will be added with just the $schema property. The alternative is to leave all dialects as v1 (assuming a $schema that defaults to https://datapackage.org/profiles/1.0/tabledialect.json). That would also mean that remote dialects can stay remote.
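The detection rule from the list above is simple enough to sketch. This is Python for illustration, and `dialect_schema_url` is a hypothetical name. It only chooses the URL; whether to add a dialect when none is present is the open question and is not decided here.

```python
V1_DIALECT_SCHEMA = "https://datapackage.org/profiles/1.0/tabledialect.json"
V2_DIALECT_SCHEMA = "https://datapackage.org/profiles/2.0/tabledialect.json"

# Properties whose presence pins a dialect to v1 in the list above.
V1_ONLY_PROPERTIES = ("caseSensitiveHeader", "csvddfVersion")


def dialect_schema_url(dialect: dict) -> str:
    """Pick the `$schema` URL for an inline dialect object."""
    if any(prop in dialect for prop in V1_ONLY_PROPERTIES):
        return V1_DIALECT_SCHEMA
    return V2_DIALECT_SCHEMA
```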

Other changes

For each schema

Note that upconverting a remote schema requires downloading it and including it inline in the package.

Add schema.$schema

  • [ ] Set to https://datapackage.org/profiles/2.0/tableschema.json because we will update the schema to that version.

Update schema.primaryKey

  • [ ] Convert from string to array.

Update schema.foreignKeys

  • [ ] Convert schema.foreignKeys[x].fields from string to array
  • [ ] Convert schema.foreignKeys[x].reference.fields from string to array
  • [ ] If schema.foreignKeys[x].reference.resource equals the name of the resource itself => remove the property
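The schema steps above (set $schema, convert the keys to arrays, drop a self-referencing resource) can be sketched as follows. This is Python for illustration with hypothetical names. It assumes an inline schema whose foreign keys are well formed, each with `fields` and a `reference` object.

```python
V2_TABLE_SCHEMA = "https://datapackage.org/profiles/2.0/tableschema.json"


def as_list(value):
    """Wrap a single string in a list; leave arrays unchanged."""
    return [value] if isinstance(value, str) else value


def upgrade_schema(schema: dict, resource_name: str) -> dict:
    """Return a copy of an inline Table Schema with v2-style keys."""
    result = dict(schema)
    result["$schema"] = V2_TABLE_SCHEMA
    if "primaryKey" in result:
        result["primaryKey"] = as_list(result["primaryKey"])
    if "foreignKeys" in result:
        upgraded = []
        for foreign_key in result["foreignKeys"]:
            foreign_key = dict(foreign_key)
            foreign_key["fields"] = as_list(foreign_key["fields"])
            reference = dict(foreign_key["reference"])
            reference["fields"] = as_list(reference["fields"])
            # A reference to the resource itself drops the `resource` property.
            if reference.get("resource") == resource_name:
                del reference["resource"]
            foreign_key["reference"] = reference
            upgraded.append(foreign_key)
        result["foreignKeys"] = upgraded
    return result
```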

No action required

For each field

Other changes

peterdesmet · Aug 30 '24 08:08