awesome-data
Historical GDP
Angus Maddison has great datasets of historical GDP and population. Package that.
Can get from http://datahub.io/dataset/econonomic-history-gdp-historical-estimates
Note: may be IP issues - should flag that in the License section
Could start by just getting this data into a Google Doc, nice and tidy.
Original files have been removed, apparently. Will work on the original .xls files.
Historical GDP Per Capita Historic GDP
To improve the quality of these data packages, I would recommend removing the first two rows of these datasets. If this is needed, I will take care of this later on.
For now, I will take care of the remaining parts of the process to move this package forward.
Issue: http://prntscr.com/ay3cs3
As soon as this is solved, I can publish it. I already have the PR ready.
@gsilvapt can you please do http://data.okfn.org/doc/core-data-curators#3-quality-assurance - i.e. post links here.
@rgrp I cannot move the repositories to the /datasets organization because I don't have admin access. At least that is what GitHub says. The links to the repositories were posted in a previous comment, but here they are again:
https://github.com/gsilvapt/historical-gdp https://github.com/gsilvapt/historical-gdp-per-capita
Hi @gsilvapt. This is the link for the okfn tools and plugins: http://data.okfn.org/tools
Follow the Validate link and paste the link to your repository there. If it is valid, it should look something like this.
Follow the View link and paste the link to your repository there. The result should look something like this.
Once the dataset passes validation and you think it looks good, paste the validation and view links here.
Update: In your case, there should not be a graph.
Oh, okay. I had run the validator in my terminal. But here they are again then:
Historic GDP
http://data.okfn.org/tools/validate?url=https%3A%2F%2Fgithub.com%2Fgsilvapt%2Fhistorical-gdp
http://data.okfn.org/tools/view?url=https%3A%2F%2Fgithub.com%2Fgsilvapt%2Fhistorical-gdp
Historic GDP Per Capita
http://data.okfn.org/tools/validate?url=https%3A%2F%2Fraw.github.com%2Fgsilvapt%2Fhistorical-gdp-per-capita%2Fmaster%2Fdatapackage.json
http://data.okfn.org/tools/view?url=https%3A%2F%2Fgithub.com%2Fgsilvapt%2Fhistorical-gdp-per-capita
Thanks for helping :+1:
You're welcome. I opened a few issues on your Historic GDP repository; I hope they help to tidy up your dataset. Good luck!
@gsilvapt what's the logic behind splitting them into two data packages? My sense is that having these in one package (if they come from one data source) is probably better tbh, though with two different data files in the data package.
@rgrp Because the original sheet had 3 separate tabs and they are different types of data. One is GDP and one is population numbers.
@gsilvapt as discussed in channel, let's have 3 files.
Also, in terms of naming, I would suggest gdp-historical, just because the ordering makes more sense (it's GDP first, then it's the historical version of GDP).
Yes, taking care of this for ages :stuck_out_tongue: Let's hope this last try actually works!!
In regards to this data package, here you have it:
Validator: http://data.okfn.org/tools/validate?url=https%3A%2F%2Fgithub.com%2Fgsilvapt%2Fhistorical-gdp
Viewer: http://data.okfn.org/tools/view?url=https%3A%2F%2Fgithub.com%2Fgsilvapt%2Fhistorical-gdp
I have the Pull Request ready with the updated lists on registry/data. Let me know when I should do it, please.
I forgot to come back with extra news. The repository has progressed significantly. However, it is not yet perfect. In this issue we have been discussing how to improve the quality of the package.
@rgrp, some years have strings instead of numbers because they represent year intervals, for example:
Annual compound Growth rate 1990-2008
As for
Numbers have "," for decimal separator rather than "."
I will have to double-check. The professor who posted this online was European, and we tend to use commas instead of dots. Dots make more sense to me, but I didn't even look at that. What should we do about the year intervals, though?
The latest commit changes all "," to ".". As for the time intervals, I am still waiting for some guidance there.
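For anyone repeating the separator fix, here is a minimal sketch of the idea in Python. It is not the code from the commit; the helper name `normalize_decimal` is hypothetical, and it assumes the values never use "," as a thousands separator.

```python
def normalize_decimal(value: str) -> str:
    """Turn a decimal-comma string such as "0,0582390906" into "0.0582390906".

    Assumption: "," only ever appears as the decimal separator.
    """
    return value.strip().replace(",", ".")


# One row taken from the sample posted in this thread: country, year, value.
rows = [["Afghanistan", "1998", "0,0582390906"]]

# Only the value column (index 2) is rewritten; country and year stay as-is.
cleaned = [[country, year, normalize_decimal(value)] for country, year, value in rows]
```

After this step the value column can be parsed with `float()` directly.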
@gsilvapt re time intervals, it is a tough one. If we leave strings like this, though, it breaks a lot of stuff. My suggestion for these is to pick a date in the middle, add a column called "notes", and put the interval description in the notes column.
@rgrp I was trying to have a look at that, and then I noticed that the middle year actually coincides with a year that is already in the data, namely 1999. What to do then?
@gsilvapt how common are these "averages"? one option is to delete them (or put them in a separate special file).
@rgrp Every "growth" indicator has this time interval and it is the only one
@gsilvapt ok - I would just remove them, since they are probably computable from other data, and note in the ## Data section of the README that we removed them.
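A minimal sketch of that removal step in Python, assuming rows of (country, year, value) where a regular year is 1 to 4 digits and anything else (such as the interval label quoted earlier) is dropped. The function name `split_interval_rows` and the `"n/a"` placeholder value are made up for illustration; this is not the code used in the repository.

```python
import re

# Assumption: a regular year is 1 to 4 digits; interval labels are anything else.
YEAR_RE = re.compile(r"\d{1,4}")


def split_interval_rows(rows):
    """Separate rows with a plain year from rows whose year field is an interval label."""
    kept, removed = [], []
    for row in rows:
        year = row[1].strip()
        if YEAR_RE.fullmatch(year):
            kept.append(row)
        else:
            removed.append(row)
    return kept, removed


rows = [
    ["Afghanistan", "1998", "0,0582390906"],
    # Placeholder value: only the year label comes from this thread.
    ["Afghanistan", "Annual compound Growth rate 1990-2008", "n/a"],
]
kept, removed = split_interval_rows(rows)
```

Returning the removed rows as well makes it easy to list them in the README or to write them to a separate file, the other option mentioned above.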
That has been taken care of:
Validator: http://data.okfn.org/tools/validate?url=https%3A%2F%2Fgithub.com%2Fgsilvapt%2Fhistorical-data
Viewer: http://data.okfn.org/tools/view?url=https%3A%2F%2Fgithub.com%2Fgsilvapt%2Fhistorical-data
@pdehaye would you like to review?
This dataset has not been reviewed yet, and I'd like to transfer ownership once you guys have had a chance to review it. Thanks!
@gsilvapt I suggest sorting the Year column too, like:
Afghanistan,1998,"0,0582390906"
Afghanistan,2002,"0,2515428942"
Afghanistan,2004,"0,0769589935"
instead of
Afghanistan,1998,"0,0582390906"
Afghanistan,2004,"0,0769589935"
Afghanistan,2002,"0,2515428942"
Also, I see that the values in the "Value" column are strings instead of numbers - they are inside double quotes. Should they be? (I'm not sure this needs to be fixed - just saying)
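Both points could be handled in one pass. The sketch below uses Python's standard `csv` module on the three sample rows above; it assumes a header-less (country, year, value) layout and that the double quotes are only there because of the decimal comma. It is one possible approach, not the fix that was applied.

```python
import csv
import io

# The unsorted sample rows from this comment (no header row assumed).
raw = '''Afghanistan,1998,"0,0582390906"
Afghanistan,2004,"0,0769589935"
Afghanistan,2002,"0,2515428942"
'''

rows = list(csv.reader(io.StringIO(raw)))

# Parse year as int and value as float (decimal comma replaced by a dot).
parsed = [
    (country, int(year), float(value.replace(",", ".")))
    for country, year, value in rows
]

# Sort by country first, then by year within each country.
parsed.sort(key=lambda r: (r[0], r[1]))

# With a dot as decimal separator the writer no longer needs to quote the value.
out = io.StringIO()
csv.writer(out, lineterminator="\n").writerows(parsed)
result = out.getvalue()
```

Once the decimal separator is a dot, the value field contains no comma, so the CSV writer emits it without quotes and validators can treat the column as a number.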
Thanks for the suggestions. I will work on a fix for this :+1: