What data to add next?

Open mledoze opened this issue 10 years ago • 120 comments

I would like to discuss here the data that should be added to this repository.

A similar project, OxJS [1], contains a lot more data, such as the land area or the latitude/longitude coordinates of each country.

Is it interesting/useful to have this kind of data too?

Data that can be added:

  • land (land mass in square kilometers [3])
  • latitude (latitude coordinate of the capital [2])
  • longitude (longitude coordinate of the capital [2])
  • east (longitude of the country's eastern boundary [3])
  • north (latitude of the country's northern boundary [3])
  • south (latitude of the country's southern boundary [3])

What would you like to be added?

Please let me know in the comments.

[1] http://oxjs.org/#doc/Ox.COUNTRIES
[2] source: http://opengeocode.org/
[3] source: https://oxjs.org/#doc/Ox.COUNTRIES


From the comments

  • add the type of the country (country, sovereign state, public body, territory, etc.)
  • add the land borders (done, see https://github.com/mledoze/countries/tree/v1.3)
  • add regions, provinces and cities

mledoze avatar Oct 04 '13 23:10 mledoze

It might be useful to provide the country name in the native language of the country itself (e.g. {"name": "Germany", "name_native": "Deutschland"}).

scento avatar Oct 22 '13 17:10 scento

The CLDR database of the Unicode project contains country-to-language data, including the percentage of speakers: http://www.unicode.org/cldr/charts/latest/supplemental/territory_language_information.html

scento avatar Oct 22 '13 18:10 scento

It might be useful to provide the country name in the native language of the country itself

The native name of Germany is already in 'alt-spellings'. I recognize that the name 'alt-spellings' isn't good since it contains alternative spellings and the native name of the country. So there are two solutions here:

  • either we change 'alt-spellings' to 'alt-names' and keep the native name here
  • or we keep 'alt-spellings' just for alternative spellings and create 'name-native' as you suggested.

Initially, I created this dataset with a country selector in mind [1] but it would make more sense to be able to get the native names separately. So I would choose the second option.
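A minimal sketch of what the second option could look like, assuming a record that stores 'alt-spellings' as a comma-separated string and a native name that is supplied by hand. The helper name and the sample record are illustrative only, and 'name-native' is the key proposed in this thread, not an existing field.

```python
def split_native_name(record, native_name):
    """Return a copy of `record` with `native_name` moved out of 'alt-spellings'.

    Hypothetical helper: 'name-native' is the key proposed in this thread,
    not an existing field of the dataset.
    """
    raw = record.get("alt-spellings", "")
    spellings = [s.strip() for s in raw.split(",") if s.strip()]
    # Keep every alternative spelling except the native name itself.
    remaining = [s for s in spellings if s != native_name]
    updated = dict(record)
    updated["alt-spellings"] = ",".join(remaining)
    updated["name-native"] = native_name
    return updated


# Illustrative record; the 'alt-spellings' value is made up for the example.
germany = {"name": "Germany", "alt-spellings": "DE,Deutschland"}
print(split_native_name(germany, "Deutschland"))
```

The original record is left untouched, so the old and new layouts can be compared before anything is written back.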

But the second option raises the question of how to write the native name of the country. German uses Latin characters, so it's easy to know that it's Germany, but what about Armenia, for example, which is written Հայաստան in Armenian [2]? For some people it might be difficult to know that it's Armenia.

What do you think?

I know that alternative spellings and native names are missing for many countries; I'm currently working on adding them. I'll also add the native/official language(s) of each country.

[1] https://github.com/JamieAppleseed/selectToAutocomplete [2] http://en.wikipedia.org/wiki/Armenia

mledoze avatar Oct 23 '13 09:10 mledoze

Not everyone speaks English, so some people might be confused while selecting their locale. It would be useful to be able to show the English and native versions of the country name side by side in the selector.

I would recommend providing both versions to cover different use cases.
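As a rough sketch of the side-by-side idea, assuming each entry has an English name and, optionally, a native name (the function and parameter names are made up for illustration):

```python
def selector_label(name, name_native=None):
    """Build a selector label showing the English and native names together."""
    # Fall back to the English name alone when no distinct native name is known.
    if name_native and name_native != name:
        return f"{name} ({name_native})"
    return name


print(selector_label("Germany", "Deutschland"))
print(selector_label("Armenia", "Հայաստան"))
```

A reader who only knows one of the two scripts can still find the entry this way.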

scento avatar Oct 23 '13 17:10 scento

Right, that's a valid point for non-English speakers.

If you want, feel free to start working on adding the native names as I'll be off for a few days.

mledoze avatar Oct 24 '13 14:10 mledoze

I think it would be great to have a way to make Countries Hierarchical and have meta data describing whether they are countries or sovereign states.

For the UK currently it says "alt-spellings":"GB,Great Britain,England,UK,Wales,Scotland,Northern Ireland".

The full name of the UK is "The United Kingdom of Great Britain and Northern Ireland". It is not a country, it is a sovereign state.

Great Britain also isn't a country, it's an island.

There are three countries in Great Britain: England, Scotland and Wales.

So the types I think needed are: Country, State, Sovereign State and potentially Nation and Union as well.

Then it would be good to have a way to specify that England is within the UK and if you also have unions that it is within the EU.

Another nice feature would be to list what land borders a country has. So you could specify that England borders Scotland and Wales for example.
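To make the idea concrete, here is a minimal sketch of typed, hierarchical entries with land borders. The class, field names and type labels are assumptions for illustration and not the dataset's schema; the three entries only encode the relationships described in this comment.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Entity:
    name: str
    kind: str                       # e.g. "country", "sovereign state", "union"
    parent: Optional[str] = None    # name of the enclosing entity, if any
    borders: List[str] = field(default_factory=list)


# Hypothetical entries mirroring the example given in this comment.
ENTITIES = {
    "England": Entity("England", "country", parent="United Kingdom",
                      borders=["Scotland", "Wales"]),
    "United Kingdom": Entity("United Kingdom", "sovereign state",
                             parent="European Union"),
    "European Union": Entity("European Union", "union"),
}


def ancestors(name, entities=ENTITIES):
    """Follow the 'parent' links upwards and return the enclosing entities."""
    chain = []
    current = entities[name].parent
    while current is not None:
        chain.append(current)
        current = entities[current].parent
    return chain


print(ancestors("England"))
print(ENTITIES["England"].borders)
```

A single parent link keeps the sketch simple; an entity that belongs to several unions at once would need a list of parents instead.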

stephenpaulger avatar Oct 31 '13 13:10 stephenpaulger

From https://github.com/ProGNOMmers

It would be wonderful if it would be possible to retrieve regions, provinces and cities.

Something like:

// Regions of country
// /rest/alpha2/it/regions ->
{ regions:  [ "Abruzzi e Molise",
              "Basilicata",
              "Calabria",
              "Campania",
              "Emilia-Romagna",
              "Friuli-Venezia Giulia",
              "Lazio",
              "Liguria",
              "Lombardia",
              "Marche",
              "Piemonte",
              "Puglia",
              "Sardegna",
              "Sicilia",
              "Toscana",
              "Trentino-Alto Adige",
              "Umbria",
              "Valle d'Aosta",
              "Veneto" ] }

// Provinces of region
// /rest/alpha2/it/regions/Veneto/provinces ->
{ provinces: [ "Verona", "Venezia", ... ] }

// Cities of province
// /rest/alpha2/it/regions/Veneto/provinces/Venezia/cities ->
{ cities: [ { name: "Venezia", zip_codes: [ "30121", ... , "30176" ] }, 
            { name: "Chioggia", zip_codes: [ "30015" ] },
            { name: "San Donà di Piave", zip_codes: [ "30027" ] }, 
            ... ] }

// Cities of country by name
// /rest/alpha2/it/regions/Veneto/provinces/Venezia/cities ->
{ cities: [ { name: "Venezia", zip_codes: [ "30121", ... , "30176" ] }, 
            { name: "Chioggia", zip_codes: [ "30015" ] },
            { name: "San Donà di Piave", zip_codes: [ "30027" ] }, 
            ... ] }

Cities could have metadata such as zip codes, which are very useful.

It is a huge amount of work, because recording and maintaining the whole list of regions, provinces and cities for every country in the world is hard, but it is a good goal for an open source project.
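A minimal sketch of how such paths could be resolved against nested data, assuming everything is held in one in-memory structure. The nesting, the `resolve` function and the `PREFIX` constant are assumptions for illustration; the sample values are taken from the example above.

```python
PREFIX = "/rest/alpha2/"

# Hypothetical nested layout; only a few values from the example are included.
GEO = {
    "it": {
        "regions": {
            "Veneto": {
                "provinces": {
                    "Venezia": {
                        "cities": [
                            {"name": "Chioggia", "zip_codes": ["30015"]},
                            {"name": "San Donà di Piave", "zip_codes": ["30027"]},
                        ]
                    }
                }
            }
        }
    }
}


def resolve(path, data=GEO):
    """Resolve a path such as '/rest/alpha2/it/regions' against nested data.

    Returns the sorted child names for a mapping, or the list itself for a leaf.
    Raises KeyError if any path segment is unknown.
    """
    if path.startswith(PREFIX):
        path = path[len(PREFIX):]
    node = data
    for part in path.strip("/").split("/"):
        node = node[part]
    return sorted(node) if isinstance(node, dict) else node


print(resolve("/rest/alpha2/it/regions"))
print(resolve("/rest/alpha2/it/regions/Veneto/provinces"))
print(resolve("/rest/alpha2/it/regions/Veneto/provinces/Venezia/cities"))
```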

fayderflorez avatar Nov 02 '13 11:11 fayderflorez

@stephenpaulger

I think it would be great to have a way to make Countries Hierarchical and have meta data describing whether they are countries or sovereign states.

I agree, I'll add this to the to-do list. I know that many entries in the dataset are not actual countries. I wanted to provide simple and factual data about world countries, but I understand that more accuracy is needed.

mledoze avatar Nov 04 '13 09:11 mledoze

@fayder

It would be wonderful if it would be possible to retrieve regions, provinces and cities.

Yes, it is a huge amount of work. First I want to continue adding more data at the country level (native and official names, official language, etc.) and add the master file as soon as possible (#12) to make contributing easier.

Thank you for your help/feedback, I appreciate it!

mledoze avatar Nov 04 '13 09:11 mledoze

For the UK currently it says "alt-spellings":"GB,Great Britain,England,UK,Wales,Scotland,Northern Ireland".

@stephenpaulger in bd22b4a97f30ead3ae55f68d2c3e9b86ba784ba7 I have removed most of the names in altSpellings; now it's just GB,UK,Great Britain.

mledoze avatar Nov 16 '13 16:11 mledoze

We can also add time zone data from http://timezonedb.com/download.

mledoze avatar Nov 17 '13 11:11 mledoze

It would be really nice if there were also a list of states per country, such as the states of the United States. http://en.wikipedia.org/wiki/List_of_states_and_territories_of_the_United_States

shanti2530 avatar Feb 13 '14 15:02 shanti2530

@shanti2530 yes, this has been suggested https://github.com/mledoze/countries/issues/6#issuecomment-27620009 but it has not been done yet because the work is pretty huge. Do you know a source where we can find the states for every country?

mledoze avatar Feb 13 '14 15:02 mledoze

@mledoze don't know if this is what you were looking for http://vikku.info/programming/geodata/geonames-get-country-state-city-hierarchy.htm

shanti2530 avatar Feb 13 '14 15:02 shanti2530

@shanti2530 this seems very good, thank you. I'll create an issue for this. Would you like to work on this?

mledoze avatar Feb 13 '14 15:02 mledoze

GeoJSON outlines of the countries: https://github.com/datasets/geo-boundaries-world-110m

gerbenjacobs avatar Mar 20 '14 13:03 gerbenjacobs

@gerbenjacobs yes good idea, I'll add this to the to-do

mledoze avatar Mar 20 '14 14:03 mledoze

I agree with gerbenjacobs' idea of adding GeoJSON outlines of the countries.

oriolfg avatar Apr 01 '14 14:04 oriolfg

@mledoze don't know if it's in the scope of this project, but I would love to see financial information like GDP, GDP per capita, GNI etc. - problem with this is of course that these numbers would change every year.

matiassingers avatar Apr 28 '14 07:04 matiassingers

@matiassingers no, it's not really in the scope of this project. I prefer to stick with static data that does not change. The dataset currently contains population data, which is out of scope, and I would like to remove it in the near future.

Although it does not currently contain GDP data, you should check this project, https://github.com/tinata/tinatapi, which contains other financial data.

mledoze avatar Apr 28 '14 10:04 mledoze

@dalu the postal prefixes are a good idea!

mledoze avatar May 02 '14 08:05 mledoze

@dalu you are saying that postal services want the native country name instead of the country postal prefix?

mledoze avatar May 02 '14 13:05 mledoze

I would like to inform you that I am about to remove the population data, because it requires frequent updates to stay relevant.

I recently added a CONTRIBUTING file explaining the contribution rules of this project. Population data does not follow these rules.

mledoze avatar May 06 '14 16:05 mledoze

acknowledged

fayderflorez avatar May 07 '14 12:05 fayderflorez

How about the address format, from the page mentioned above: http://en.wikipedia.org/wiki/Address_(geography)

This may be fairly difficult to do, as it requires some pseudo templating language. For example, for the US: "addressFormat": "{{name}}\n{{houseNumber}} {{street}}\n{{locality}}\n{{city}}\n{{postalCode}}"

And would need agreement on the labels used...
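A minimal sketch of how such a template could be filled in, assuming the {{label}} syntax and the labels from the example above. The `render_address` function and the sample address are made up for illustration.

```python
import re

# Template string taken from the example above.
US_FORMAT = "{{name}}\n{{houseNumber}} {{street}}\n{{locality}}\n{{city}}\n{{postalCode}}"


def render_address(template, fields):
    """Replace each {{label}} with its value and drop lines that end up empty."""
    def substitute(match):
        # Unknown or missing labels render as an empty string.
        return str(fields.get(match.group(1), ""))

    rendered = re.sub(r"\{\{(\w+)\}\}", substitute, template)
    lines = [line.strip() for line in rendered.split("\n")]
    return "\n".join(line for line in lines if line)


# Made-up address; 'locality' is deliberately missing.
address = {
    "name": "Jane Doe",
    "houseNumber": "1",
    "street": "Example Street",
    "city": "Springfield",
    "postalCode": "00000",
}
print(render_address(US_FORMAT, address))
```

Dropping empty lines is a design choice of this sketch, so optional labels such as locality do not leave gaps in the output.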

tdegrunt avatar May 27 '14 19:05 tdegrunt

Hi, nice project! Thanks.

Something that would be useful to me is to know if a country is in the European Union. (https://en.wikipedia.org/wiki/Member_state_of_the_European_Union)

This information is needed when you are a company in the EU dealing with international customers. If you charge VAT or not depends on whether your customer is in the EU or not.

If you are interested in including this information, I could set up a pull request.

wires avatar Jun 24 '14 09:06 wires

@tdegrunt yes it is indeed a difficult task to do, but @hexorx managed to do it in his countries repository: https://github.com/hexorx/countries/blob/master/lib/data/countries.yaml

mledoze avatar Jun 24 '14 09:06 mledoze

@0x01 yes I'm interested in including this information. Could you please add it as extra data in the data folder using [cca3].json file names?

Thank you!

mledoze avatar Jun 24 '14 09:06 mledoze

@mledoze: I can do that (in data folder), but I think it makes more sense to put it into the main file. There is very little data added, basically a boolean whether or not it's an EU member state (and I'll leave out the field if it's false)

wires avatar Jul 01 '14 13:07 wires

@0x01 you are right that this represents little data, but it would be useful for only 10% of the countries (26 member states of the EU out of 251 "countries" in the dataset).

Moreover, the EU is categorized as a supranational union, and there are many other unions in the world (see [1]). To avoid adding many booleans to the main file, I prefer to add this data in separate files.

[1] http://en.wikipedia.org/wiki/Political_union#Supranational_and_continental_unions
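A rough sketch of how a consumer could read membership from separate per-country files, assuming one data/[cca3].json file per country with a lowercase file name and a 'unions' list inside it. The key name, the file-name casing and both helper functions are assumptions, not the repository's actual layout.

```python
import json
from pathlib import Path


def load_extra(cca3, data_dir="data"):
    """Load <data_dir>/<cca3>.json if it exists, otherwise return an empty dict."""
    path = Path(data_dir) / f"{cca3.lower()}.json"
    if not path.exists():
        return {}
    with path.open(encoding="utf-8") as handle:
        return json.load(handle)


def is_member(cca3, union, data_dir="data"):
    """True if the extra file lists `union` under the hypothetical 'unions' key."""
    return union in load_extra(cca3, data_dir).get("unions", [])


# Usage, assuming data/fra.json contains {"unions": ["EU"]}:
# is_member("FRA", "EU")
```

A list of unions rather than one boolean per union keeps the files small for countries that belong to none, and a missing file simply means no membership is recorded.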

mledoze avatar Jul 01 '14 14:07 mledoze