Add populations to Wikidata
Is there anyone who has the right to edit Wikidata items for counties? Basically, that means an account with 50+ edits on Wikidata, which makes it "autoconfirmed".
Right now I have that level, but it's quite tedious to fix all populations alone and I'd be happy if someone could help me.
Some of the locations are less important and everyone can edit them, like these ones in Panama: https://www.wikidata.org/wiki/Q217138
Others are among the "top 3000" items, and only autoconfirmed accounts can edit them. Editing the less important items first is a way for someone to reach that autoconfirmed level.
So who would like to help by entering population information?
Populations are still missing for:
- [x] Slovenia
- [x] Ireland
- [x] Poland
- [x] Lithuania
- [ ] South Korea
- [x] Panama
- [ ] Sebastopol, Russia
Populations on Wikipedia: https://en.wikipedia.org/wiki/Provinces_of_Panama#Provinces
Alternative source from @ciscorucinski which has it at county level: https://www.citypopulation.de/en/panama/admin/
Yes, sometimes I use Wikipedia as a source, but the data still needs to be added manually. Citypopulation.de doesn't allow downloading the map data. We also prefer official data backed by a matching government dataset.
Here's population information down to the corregimiento level (the granularity at which we get COVID data): https://github.com/EricLuceroGonzalez/Panama-Political-Division. The population is presumably from the 2010 census, but that would have to be verified.
Which raises the question: how are we validating any of this?
The Wikidata interface seems intimidating for adding information.
@ciscorucinski you mean how to add population data?
Basically:
- click "add statement" at the bottom
- select the "population" property
- enter the value
- add qualifier
- select point in time
- enter year
- add reference
- select "reference URL", or type P4656 for a Wikipedia import
- paste URL
- save
If you do this in multiple steps, it's quite easy to get past the 50 edits required to get your account autoconfirmed. For example, add, publish, add qualifier, publish, add reference, publish can get you 4 edits per item. So with 13 regions you are over 50 edits :-)
Done for Panama's provinces.
There are ways of doing this via Google Sheets and a tool called QuickStatements. Since we are only concerned with one type of data import process, we should be able to create a fairly standardized process within a spreadsheet.
Google Sheets + QuickStatements: https://www.youtube.com/watch?v=bUpJN4IklJ8
OpenRefine: https://www.youtube.com/watch?v=wfS1qTKFQoI
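For reference, here is a minimal sketch of what a standardized batch could look like when generated from a script instead of a spreadsheet. It assumes the QuickStatements V1 tab-separated syntax (P1082 for population, a P585 "point in time" qualifier with year precision `/9`, and an S854 reference URL). The function name `quickstatement_line` and the sample row are made up for illustration; the population value and URL are placeholders, not real data.

```python
def quickstatement_line(qid, population, year, source_url):
    """Build one tab-separated QuickStatements V1 command (a sketch)."""
    fields = [
        qid,                                      # item to edit
        "P1082",                                  # property: population
        str(int(population)),                     # quantity value
        "P585",                                   # qualifier: point in time
        f"+{int(year):04d}-00-00T00:00:00Z/9",    # /9 means year precision
        "S854",                                   # reference: reference URL
        f'"{source_url}"',                        # strings go in double quotes
    ]
    return "\t".join(fields)


# Placeholder row: Q4115189 is the Wikidata sandbox item, the numbers are invented.
rows = [("Q4115189", 1000, 2010, "https://example.org/census")]

for row in rows:
    print(quickstatement_line(*row))
```

It would be worth pasting a handful of generated lines into QuickStatements and checking the result on a test item before running a full batch.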
@ciscorucinski if you can mass import using this tool it'd be great! So far I've done all my edits by hand.
@hyperknot you can! But I am uncertain how to go about doing it for this data right now
Luckily we don't have that many missing populations. If we encounter another country with a lot, I'll comment here.
Is there an easy way to find what is missing?
Ones without population in this JSON: https://raw.githubusercontent.com/hyperknot/country-levels-export/master/iso2.json
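A rough sketch of that check in Python. The real layout of `iso2.json` is not shown in this thread, so the structure below (a top-level object mapping a code to a record with an optional `population` field) is an assumption, and the sample codes and names are invented. Adjust the key names to whatever the file actually uses.

```python
import json


def missing_population(records):
    """Return the keys of records whose population is absent or null.

    Assumes `records` maps an ID to a dict with an optional "population" key.
    """
    return sorted(
        key for key, rec in records.items() if rec.get("population") is None
    )


# Invented sample in the assumed shape, not taken from the real export.
sample = json.loads(
    """
    {
      "XX-1": {"name": "Alpha", "population": 100},
      "XX-2": {"name": "Beta", "population": null},
      "XX-3": {"name": "Gamma"}
    }
    """
)

print(missing_population(sample))
```

To run it against the real file, load the downloaded JSON with `json.load` and pass the resulting dict in, after confirming the field names.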
Portugal seems like a good candidate: https://github.com/hyperknot/country-levels-export/blob/master/docs/iso2_list/PT.md
We need to add: Slovenia, Ireland, Poland, and Lithuania.
I fixed Ireland and Poland. What is missing in Lithuania?
For Slovenia, it really needs that batch updating effort! @ciscorucinski can you help with that?
Let's create a Google Sheet, and try out a few records before mass editing. I have never edited a wikidata entry, so consider me a noob here 😅
What info is needed to identify a population point in terms of Wikidata? We need Q IDs for a few datapoints, but these can be retrieved through a Wikidata Chrome extension in Google Sheets.
Just datapoint names such as country, state, and county-level names should be good enough, I guess? Along with the population data and the URL reference.
All the Q-s we need are here: https://github.com/hyperknot/country-levels-export/blob/master/docs/iso2_list/SI.md
Machine readable format is this: https://raw.githubusercontent.com/hyperknot/country-levels-export/master/iso2.json
The other side of the equation should be a government census CSV listing those populations.
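Joining the two sides could look roughly like this. It is only a sketch: the column names `name` and `population`, the helper name `match_census`, and the sample IDs are all hypothetical, and matching by name alone is an assumption that breaks when two places share a name.

```python
import csv
import io


def match_census(qid_by_name, census_csv_text):
    """Match census rows to Q IDs by case-insensitive, trimmed name.

    Returns (matched, unmatched): a dict of Q ID -> population, and the
    list of census names that had no Q ID.
    """
    lookup = {name.strip().casefold(): qid for name, qid in qid_by_name.items()}
    matched, unmatched = {}, []
    for row in csv.DictReader(io.StringIO(census_csv_text)):
        key = row["name"].strip().casefold()
        if key in lookup:
            matched[lookup[key]] = int(row["population"])
        else:
            unmatched.append(row["name"])
    return matched, unmatched


# Invented sample data; the Q IDs are placeholders.
qids = {"Alpha": "Q_ALPHA", "Beta": "Q_BETA"}
census = "name,population\nalpha ,120\nGamma,45\n"

matched, unmatched = match_census(qids, census)
print(matched, unmatched)
```

Anything that ends up in `unmatched` should be reviewed by hand rather than guessed.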
Really not ideal (has some weird character errors) but here is a CSV from the Slovenian Statistical Bureau. Data is from 2019. https://gist.github.com/qgolsteyn/145d82f984d65c34e778371a69cf5433
@qgolsteyn thanks! Do you have the source for this file? Maybe chardetect would tell us what encoding it's in.
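If chardetect isn't at hand, a standard-library fallback is to try a few likely encodings and keep the one that produces the most expected letters. This is a sketch and not how chardet works. The candidate list (UTF-8, cp1250, ISO-8859-2) and the Slovenian letters used for scoring are assumptions about what the file might contain.

```python
CANDIDATES = ("utf-8", "cp1250", "iso-8859-2")  # assumed likely encodings


def guess_encoding(raw, expected_chars="čšžČŠŽ"):
    """Return the candidate encoding whose decoding yields the most expected letters.

    Returns None if no candidate can decode the bytes at all.
    """
    best = None
    for enc in CANDIDATES:
        try:
            text = raw.decode(enc)
        except UnicodeDecodeError:
            continue  # this encoding cannot represent the bytes
        score = sum(text.count(ch) for ch in expected_chars)
        if best is None or score > best[1]:
            best = (enc, score)
    return best[0] if best else None


# Round-trip check on a known string rather than on the real file.
print(guess_encoding("Šoštanj".encode("cp1250")))
```

On ties the earlier candidate wins, so plain ASCII input is reported as UTF-8.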
I don't have it at hand, but I will get the source to you by this evening. I have also updated the list with additional countries that need population info.
Thanks!
My apologies, here is Slovenia's data: https://pxweb.stat.si/SiStatDb/pxweb/en/10_Dem_soc/10_Dem_soc__05_prebivalstvo__10_stevilo_preb__20_05C40_prebivalstvo_obcine/05C4002S.px/table/tableViewLayout2/
Portugal is done, as is Colombia. Working on Slovenia next.
I think Slovenia is done, but I got "errors" on their tool, despite there being hundreds of successful edits....
EDIT
Because I tried to add the atomic number of a municipality among other atrocities 😆 Anyway, it's processing now, should be done soon.
Lithuania should be done...after much struggle. I'm off for the rest of the night.
@shaperilio thanks so much! I've already updated the file, but I'll run the processing again for Lithuania as well.
Korea should be up to date now
@hyperknot, is this issue still open? Wondering what the current status is. Cheers, z