Include JSON in repository?

Open jochakovsky opened this issue 8 years ago • 14 comments

http://data.okfn.org/data/core/country-list makes both CSV and JSON formats available for download, but only the CSV is directly available in this repository. Would it be possible to include the JSON in this repository as well? Thank you!

jochakovsky avatar May 26 '16 14:05 jochakovsky

@jochakovsky what's the exact use-case?

In general, we want to keep this as a clean tabular data package, which means CSV only.

However, we also want to support user needs so good to know the requirement :-)

rufuspollock avatar May 29 '16 06:05 rufuspollock

A popular package such as https://www.npmjs.com/package/country-list could really do with getting its dependencies from here.

Glutnix avatar Oct 13 '16 04:10 Glutnix

Currently the npm package country-list is getting the data from here, but it needs to convert it from csv to json.

fannarsh avatar Oct 13 '16 11:10 fannarsh

@fannarsh really useful data point. How are you getting this data package? Are you submoduling it, pulling it direct from raw or getting it from data.okfn.org/data/country-list? I note the latter already has a JSON version via the API but it sounds like having json in the repo would be useful.

Let me know a bit more about what you'd like and let's see if we can get it working for you 😄

rufuspollock avatar Oct 14 '16 11:10 rufuspollock

Currently I'm pulling it once (per update) from raw and then converting it to json and storing it in my repo. To be honest, whether the data is in csv or json doesn't matter much to me, but of course it would be nicer and a little bit more convenient to be able to download json directly.
On the other hand, if you published the data as an npm package then I could simply require that package as a dependency and never worry about needing to update the data myself. That would also make it easier for other developers to include your data in other projects (if there are such developers that wouldn't like to use my module 😀 ).
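
For anyone following along, the pull-and-convert step could look roughly like the sketch below. It is not the code the country-list module actually uses: the `parseCsvLine` and `csvToJson` helpers are hypothetical names, and the inline sample (including its `Name,Code` header) is an assumed stand-in for the real CSV, which would normally be downloaded from raw first.

```javascript
// Minimal CSV -> JSON sketch (illustrative only, not the module's real code).
// Assumption: the first line is a header row and quoted fields contain no
// embedded newlines.

// Split one CSV line into fields, honouring double-quoted values.
function parseCsvLine(line) {
  const fields = [];
  let current = '';
  let inQuotes = false;
  for (let i = 0; i < line.length; i++) {
    const ch = line[i];
    if (inQuotes) {
      if (ch === '"' && line[i + 1] === '"') {
        current += '"'; // escaped quote ("") inside a quoted field
        i++;
      } else if (ch === '"') {
        inQuotes = false; // closing quote
      } else {
        current += ch;
      }
    } else if (ch === '"') {
      inQuotes = true; // opening quote
    } else if (ch === ',') {
      fields.push(current);
      current = '';
    } else {
      current += ch;
    }
  }
  fields.push(current);
  return fields;
}

// Turn CSV text into an array of objects keyed by the header row.
function csvToJson(csvText) {
  const lines = csvText.split(/\r?\n/).filter((l) => l.trim() !== '');
  const header = parseCsvLine(lines[0]);
  return lines.slice(1).map((line) => {
    const values = parseCsvLine(line);
    const row = {};
    header.forEach((key, idx) => {
      row[key] = values[idx];
    });
    return row;
  });
}

// Assumed sample rows; the real file would be fetched from the repository.
const sample = 'Name,Code\nIceland,IS\n"Bolivia, Plurinational State of",BO\n';
const countries = csvToJson(sample);
console.log(JSON.stringify(countries, null, 2));
```

The quoting logic matters here because some country names contain commas, so a plain `split(',')` would break those rows.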

fannarsh avatar Oct 15 '16 01:10 fannarsh

@fannarsh that's really useful clarification.

And building an npm package is a really nice idea. If we did an npm package, I assume you'd want it as JSON, or would CSV work (or both)? And would it literally just be the raw data in there (plus datapackage.json)?

rufuspollock avatar Oct 16 '16 11:10 rufuspollock

I would just add the JSON, and I wouldn't bother adding the datapackage.json either, since it's my understanding that it's metadata about the CSV data/structure. If I needed to do any edits to the data or bigger work I would probably clone the repo and work from that. The npm package itself would never be a source for doing actual editing work on the data.

fannarsh avatar Oct 16 '16 16:10 fannarsh

I added a pull request with an npm package definition that would be good enough for my use case. https://github.com/datasets/country-list/pull/6

fannarsh avatar Oct 16 '16 16:10 fannarsh

@fannarsh reviewing the PR and thank-you for this

Just wondering atm about whether we want this in this repo or in a separate repo - we may not want node stuff in here but rather in a small separate repo.

rufuspollock avatar Oct 24 '16 12:10 rufuspollock

@fannarsh just to say I'm working on this - classic coder issue of trying to make something generic to generate these automatically. If I don't get this sorted soon I'll just take your version and post ...

rufuspollock avatar Nov 18 '16 20:11 rufuspollock

hehe, no sweat, I recognise that problem :)

fannarsh avatar Nov 18 '16 23:11 fannarsh

@fannarsh @jochakovsky we have something to check out now - an npm/node branch in this repo and a published package on npm

https://github.com/datasets/country-list/tree/npm

https://www.npmjs.com/package/@datasets/country-list

It would be great to get your feedback and thoughts here, especially as going forward we are committed to doing this node packaging for more and more of the core datasets. For example:

  • What would be the best way to generate these node packages so they are useful to other folks in the node community, both end-user developers and other package maintainers? For example, should it be totally minimal, i.e. just json, or should it include a minimal API? (For the moment we've added a small API inspired by what you did @fannarsh.)
  • What other datasets are a priority for node packaging? Currency codes? Country flags?
  • Should we publish inside an npm org, e.g. @datasets as we did with this one, or is it better to have something like country-list-data?

Any other comments or thoughts warmly welcome.

Aside: we have never forgotten about this. It has just taken a crazy long time for various reasons including some classic yak-shaving: we've been doing a major reboot of https://data.okfn.org and https://datahub.io/ -- which have merged together. Part of that is being able to do a lot of automation, ranging from generating the json to generating node packages from data packages ...

rufuspollock avatar Dec 17 '17 10:12 rufuspollock

Hi @rufuspollock,

I like what you guys have done so far.

I think that you should keep the packages under the @datasets org, and I like the idea of providing a minimal api like you have done with @datasets/country-list. However, I would like to see another package that would be data only. It could be named @datasets/country-list-data, so that I, for example, could require only the data and keep my package up to date in an easy way. The @datasets/country-list package could even do:

// https://github.com/datasets/country-list/blob/npm/index.js
let countryList = require('@datasets/country-list-data');
// instead of 
let countryList = require('./data.json');

But that all depends on how you want to maintain the packages/repos.

If you would not want to release a pure @datasets/country-list-data package then I would suggest adding data exports to the api so that I could access the data and use directly.

rawData = countries.data()
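
To make the suggestion concrete, here is a minimal sketch of an API that also exposes its raw data. It is not the actual @datasets/country-list implementation: the `getName` helper, the row shape and the inline rows (standing in for `data.json`) are assumptions for illustration; only the `data()` call comes from the line above.

```javascript
// Hypothetical module shape, NOT the real @datasets/country-list code.
// Inline rows stand in for require('./data.json'); the Name/Code keys
// are an assumed row shape.
const rows = [
  { Name: 'Iceland', Code: 'IS' },
  { Name: 'New Zealand', Code: 'NZ' },
];

const countries = {
  // Look up a country name by its code (case-insensitive).
  getName(code) {
    const wanted = String(code).toUpperCase();
    const match = rows.find((r) => r.Code === wanted);
    return match ? match.Name : undefined;
  },

  // Data export: return copies of the rows so callers can use the raw
  // data directly without being able to mutate the module's own state.
  data() {
    return rows.map((r) => ({ ...r }));
  },
};

const rawData = countries.data();
console.log(rawData.length, countries.getName('is'));
```

Returning copies is a design choice rather than a requirement; handing out the underlying array would be cheaper but would let one consumer's edits leak into every other consumer of the module.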

Regarding other datasets, I would say bring them all on 😄 but in reality, workwise, it would maybe make sense to start off with *-data packages without the minimal api, since that could take more time to figure out. But if the *-data packages are out there then it allows other developers to pick up the thread and do something useful/fun with the data. And churning out packages containing the datasets could just be a question of the right tooling.

fannarsh avatar Sep 05 '18 23:09 fannarsh

@fannarsh cool and really useful feedback. Would you like to contribute here - we could give you perms. Also folks on our team like @zelima and @svetozarstojkovic can provide support and guidance 😄

rufuspollock avatar Jan 31 '19 19:01 rufuspollock