ckanext-datapackager

Provide more guidance when importing Data Packages

Open amercader opened this issue 9 years ago • 5 comments

In general, the process of importing Data Packages is confusing and prone to errors, especially if you are not familiar with Data Packages. Apart from the obvious bugs (#37, #39) and the confusing issues (#38), I'd say that most people will struggle to import their Data Packages the first time, unless:

  • They upload/link to a datapackage.json with url on their resources
  • They upload/link to a zip file including a datapackage.json with path on their resources and the data files.

If these are the only two situations we can support, fair enough, but maybe we should provide helper text explaining what users should upload.

amercader avatar Jan 08 '16 11:01 amercader

I commented on https://github.com/ckan/ckanext-datapackager/issues/38 about the first issue of a datapackage.json without url on the resources. The issue wasn't that the resources didn't have a URL, but that they only had a path that didn't exist.

As the code is now, I would say that users will be able to import a good data package without issues (https://github.com/ckan/ckanext-datapackager/issues/39 was fixed). Users will struggle if the data package isn't valid, or if the data for any of its resources isn't available, as the error messages are somewhat confusing.

The current "happy path" is:

  • datapackage.json with all resources' data remotely accessible (no inline or local resources);
  • ZIP file with datapackage.json at the root level;

That seems sensible for a first version IMO. We might also want to add support for ZIP files with the datapackages inside a single folder (https://github.com/okfn/datapackage-py/issues/29), like the GitHub exports, but it's not essential.
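To make the ZIP cases concrete, here is a minimal sketch of how the descriptor could be located inside an archive. It covers the current happy path (datapackage.json at the root level) plus the single-folder layout mentioned above. `find_descriptor` is a hypothetical helper written for illustration; it is not part of ckanext-datapackager or datapackage-py.

```python
import io
import zipfile


def find_descriptor(zip_bytes):
    """Return the path of datapackage.json inside the ZIP, or None.

    Hypothetical helper, not the extension's actual implementation.
    """
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        names = archive.namelist()

    # Happy path: descriptor at the root level of the archive.
    if "datapackage.json" in names:
        return "datapackage.json"

    # Possible extension: everything sits inside one top-level folder,
    # as in GitHub's ZIP exports.
    top_level = {name.split("/", 1)[0] for name in names}
    if len(top_level) == 1:
        candidate = top_level.pop() + "/datapackage.json"
        if candidate in names:
            return candidate

    return None
```

Returning `None` rather than raising leaves it to the caller to decide how to word the error message shown to the user.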

vitorbaptista avatar Jan 13 '16 19:01 vitorbaptista

It would be useful to think about some common use cases with invalid datapackages (invalid for importing on CKAN), so we can make sure we handle them well.

I can think of:

  • datapackage.json with resources with only local data;
  • datapackage.json with resources with only inline data;
  • datapackage.json with resources without data;
  • Invalid datapackage (e.g. without a "name" attribute);
  • Invalid datapackage JSON file;
  • ZIP file without datapackage.json in the root folder.
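The descriptor-level cases above could be checked roughly like this for a standalone datapackage.json (inside a ZIP, a `path` may well be fine). This is only a sketch: `import_problems` and its messages are made up for illustration and do not reflect the extension's actual validation.

```python
import json


def import_problems(descriptor_text):
    """List reasons a standalone datapackage.json could not be imported.

    Hypothetical helper; the rules mirror the list above only.
    """
    try:
        descriptor = json.loads(descriptor_text)
    except ValueError:
        return ["datapackage.json is not valid JSON"]

    if not isinstance(descriptor, dict):
        return ["datapackage.json must contain a JSON object"]

    problems = []
    if not descriptor.get("name"):
        problems.append('data package has no "name" attribute')

    for index, resource in enumerate(descriptor.get("resources", [])):
        label = resource.get("name") or "resource #%d" % index
        if resource.get("url"):
            continue  # remotely accessible: the supported case
        if "path" in resource:
            problems.append("%s: only local data (path)" % label)
        elif "data" in resource:
            problems.append("%s: only inline data" % label)
        else:
            problems.append("%s: no data at all" % label)
    return problems
```

Collecting every problem instead of stopping at the first one would let the import form report them all in a single pass.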

What else?

vitorbaptista avatar Jan 13 '16 19:01 vitorbaptista

@vitorbaptista

cc @amercader

Can you please distill this into a task list that could be actionable?

pwalsh avatar May 30 '16 09:05 pwalsh

I suggest the action be to improve the README's importing section so that it describes which data packages are valid and invalid to upload.

This could be complemented with valid data package examples.

Stephen-Gates avatar Nov 29 '17 06:11 Stephen-Gates

Implementation

  • [ ] Extend the Import section in the README with a description of the formats supported
  • [x] Add helper text in the form that suggests what to upload / link to (a zipped data package or a link to datapackage.json) and what is supported (Done in c2bd8a6)

Estimate

0.5 days

amercader avatar Feb 09 '18 09:02 amercader