Example datasets?

Open pmackay opened this issue 9 years ago • 13 comments

Are there any example datasets (that match the 1.0 spec!) available on the web? Would be great to have those linked from http://openreferral.org/.

pmackay avatar Mar 17 '15 15:03 pmackay

Just wanted to add that I came here looking for an example and can't find one. I would also like to see an example somewhere prominent in the documentation.

bengolder avatar Jun 18 '15 23:06 bengolder

+1 for examples being useful.

fureigh avatar Jun 19 '15 18:06 fureigh

+1 for examples as it would be great for testing the validation tool that I'm creating (http://niveditc.github.io/open-referral-validation/).

niveditc avatar Jun 19 '15 19:06 niveditc

I'm sorry it's taken so long to address this. Previous versions of the spec were posted with sample datapackages, and I'd assumed that would have been the case with v1.0, but somehow it didn't happen. I'm asking around to see which of the pilots has a good sample that they can share. Thanks for your patience.

greggish avatar Jun 20 '15 00:06 greggish

Just reflecting that there seem to be two levels of need here:

  1. Demonstrative, detailed data sets as part of the user documentation (aimed at existing users, for testing, or for those looking deeply into the spec).
  2. Small, simplified examples that show the basics of OpenReferral at a glance, aimed at curious passersby (such as myself).

bengolder avatar Jun 21 '15 23:06 bengolder

Example CSV files are available in the Ohana API repo here: https://github.com/codeforamerica/ohana-api/tree/master/data/sample-csv

The Ohana API Wiki also has instructions for creating the CSV files, including all the columns in each file, whether or not each column is required, and a description of each field, including any special formatting. I think the documentation is easier to read when presented in this fashion, and I would recommend that the documentation in this repo be updated to match.

monfresh avatar Jun 22 '15 03:06 monfresh

@monfresh, thank you! This will be very helpful.

niveditc avatar Jun 22 '15 03:06 niveditc

Thanks @monfresh. I know there has been work done on upgraded documentation for the spec, so I'll share your suggestion.

Just to clarify: alongside those CSVs, I believe we still don't have an example JSON datapackage to go with them. We have this datapackage.json template here, but until we have a sample datapackage that corresponds with the sample CSVs, this issue should remain open.

greggish avatar Jun 22 '15 15:06 greggish

Note that the sole purpose of the datapackage.json is to provide documentation about the CSV files, but since the spec is already documented, I personally consider the datapackage.json to be redundant. The presence of datapackage.json is in no way necessary in order to import the data into any system.

The GTFS, for example, has excellent online documentation, and does not make use of a datapackage.json.

If the set of CSV files were to be shared with someone who was not familiar with the spec, I think including a text file with a link to the spec documentation would suffice.

monfresh avatar Jun 22 '15 15:06 monfresh

At first glance I was also wary of the addition of datapackage.json to our spec, as it appears to add another layer of technical complexity that may fall outside the grasp of many non-technical users who might produce or import HSDS data.

However, as I understand it, this is an emerging protocol for publishing structured datasets on the web, one that could remove some of the friction posed by the complexity of our spec. GTFS was designed before the creation of the datapackage protocol, and I've heard some suggest in retrospect that GTFS may have benefited from it. Furthermore, our spec is at this point significantly more elaborate than GTFS.

So, we should welcome opportunities to test these assumptions.

More about datapackages here: http://dataprotocols.org/data-packages/

And a tool for creating them here: http://ckan.org/2014/06/09/the-open-knowledge-data-packager/
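For anyone who hasn't seen one, here is a rough sketch of what a datapackage.json describing a single CSV might contain, generated with plain Python. The package name `hsds-sample`, the file `organizations.csv`, the helper `build_descriptor`, and the three columns are illustrative assumptions, not the official HSDS definition; only the general `name` / `resources` / `schema.fields` layout comes from the data package protocol.

```python
import json


def build_descriptor(package_name, tables):
    """Build a minimal tabular data package descriptor.

    `tables` maps a CSV path to a list of (field_name, field_type) pairs.
    This helper is hypothetical and only mirrors the general shape of a
    datapackage.json; it is not part of any HSDS tooling.
    """
    resources = []
    for path, fields in tables.items():
        resources.append({
            # Resource name is the file name without its extension.
            "name": path.rsplit(".", 1)[0],
            "path": path,
            "schema": {
                "fields": [{"name": n, "type": t} for n, t in fields],
            },
        })
    return {"name": package_name, "resources": resources}


# Illustrative columns only; check the spec for the real field list.
descriptor = build_descriptor(
    "hsds-sample",
    {
        "organizations.csv": [
            ("id", "string"),
            ("name", "string"),
            ("description", "string"),
        ],
    },
)

# Serialise the descriptor as it would appear in datapackage.json.
print(json.dumps(descriptor, indent=2))
```

The file sits next to the CSVs and only describes them, so the CSVs themselves stay importable without it.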

cc @monfresh

greggish avatar Jun 23 '15 16:06 greggish

We have been working on an example dataset to put out with the further release of the tidied-up 1.0 docs.

We are now working with datapackage.json as our definition for the HSDS, which means we can generate nice views of packages like the example here and can get the benefits of data package tooling in future.
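As a rough illustration of the kind of tooling a datapackage.json definition makes possible, the sketch below compares a CSV header row against the field names listed in a resource schema. The function `header_problems`, the sample schema, and the sample CSV text are all hypothetical and only use the Python standard library; this is not the project's actual validation tooling.

```python
import csv
import io


def header_problems(csv_text, schema):
    """Compare a CSV header row with the fields named in a schema.

    Returns (missing, unexpected): field names the schema lists but the
    CSV lacks, and CSV columns the schema does not mention.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])  # empty list if the CSV has no rows
    expected = [field["name"] for field in schema["fields"]]
    missing = [name for name in expected if name not in header]
    unexpected = [name for name in header if name not in expected]
    return missing, unexpected


# Illustrative schema fragment, shaped like a resource's "schema" entry.
sample_schema = {
    "fields": [
        {"name": "id", "type": "string"},
        {"name": "name", "type": "string"},
        {"name": "description", "type": "string"},
    ]
}

# Illustrative CSV: lacks "description" and has an extra "nickname" column.
sample_csv = "id,name,nickname\n1,Example Org,Ex\n"

missing, unexpected = header_problems(sample_csv, sample_schema)
print("missing:", missing)
print("unexpected:", unexpected)
```

A real check would also look at field types and required constraints; this only covers column names.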

timgdavies avatar Dec 13 '16 20:12 timgdavies

Moving to 1.1

timgdavies avatar Jan 31 '17 22:01 timgdavies

I'd be willing to provide an example in the future but first want to incorporate all the 1.0 updates Tim recently published to ensure it is compliant. I have provided a dataset privately to @timgdavies and am expecting the good sir will have some such feedback for me soon.

NeilMcKLogic avatar Feb 20 '17 20:02 NeilMcKLogic

I'm closing this since as of 3.0 we have a dedicated examples/ folder on the repo, and examples are rendered for each object on the schema reference page in the docs. This isn't quite a complete example dataset, but I believe this addresses similar concerns and is more appropriate and manageable for the current docs.

mrshll1001 avatar Nov 21 '23 14:11 mrshll1001