
Add standardised country code

Open MatMoore opened this issue 4 years ago • 2 comments

Forking this discussion from https://github.com/andreagrandi/covid-api/pull/42

Problem 1: Country identifiers are inconsistent

Currently, the country_region field is not a very reliable identifier.

One example is the UK, which has been reported in at least two different ways:

Currently, we ignore the change in column name but not the change in column value.

Problem 2: There is no way to consistently refer to the scope of the data

Originally most data was country level, except for China, which was broken down by region.

The data is getting more granular over time, so the same API call can return rows at different levels of detail depending on the date range you are querying.

For the UK, there are now additional rows with country_region=United Kingdom but with province_state set. To get just the UK, you need to filter out these extra rows, which include British overseas territories and crown dependencies.

For the US, they have started reporting county-level information instead of country-level.
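To illustrate the workaround this forces on API users today, here is a rough sketch in Python. The rows, the `confirmed` field, the figures, and the `country_level` helper are all made up for illustration, and it assumes the UK-wide row is the one with an empty province_state.

```python
# Made-up sample rows; the values are illustrative only, not real figures.
rows = [
    {"country_region": "United Kingdom", "province_state": None, "confirmed": 100},
    {"country_region": "United Kingdom", "province_state": "Gibraltar", "confirmed": 5},
    {"country_region": "United Kingdom", "province_state": "Isle of Man", "confirmed": 3},
]


def country_level(rows, country):
    """Keep only rows for `country` that have no province_state set.

    Hypothetical helper: assumes the country-wide row leaves province_state empty.
    """
    return [
        r for r in rows
        if r["country_region"] == country and not r.get("province_state")
    ]


uk_only = country_level(rows, "United Kingdom")
```

Every client has to know about this rule and re-implement it, which is what the proposal below tries to avoid.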

Proposal

  1. We use this lookup table to add ISO alpha-2/alpha-3 country codes to each row, and make them filterable, so that users can filter by country more reliably. We should make this a new column for backwards compatibility.

  2. We add a new column called something like "scope", with values country_region, province_state and admin2. We can infer the scope of each record by looking at which columns are filled in. The code is sort of doing this already to check uniqueness, but the result is not stored. This way, API users should be able to write a query with scope and country parameters set and always get a consistent response back.
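A minimal sketch of both points, assuming rows are plain dicts keyed by the existing column names. The `ISO_CODES` excerpt, the `iso_alpha2`/`iso_alpha3` column names, and the `infer_scope`, `annotate` and `select` helpers are hypothetical names, not existing code in the project.

```python
# Hypothetical excerpt of a name -> (alpha-2, alpha-3) lookup. The spellings
# on the left are illustrative; a real table would need every spelling that
# appears in the source data.
ISO_CODES = {
    "United Kingdom": ("GB", "GBR"),
    "US": ("US", "USA"),
    "China": ("CN", "CHN"),
}


def infer_scope(row):
    """Return the most specific level for which the row has a value."""
    if row.get("admin2"):
        return "admin2"
    if row.get("province_state"):
        return "province_state"
    return "country_region"


def annotate(row):
    """Return a copy of the row with country codes and scope added as new keys."""
    alpha2, alpha3 = ISO_CODES.get(row.get("country_region"), (None, None))
    return {**row, "iso_alpha2": alpha2, "iso_alpha3": alpha3, "scope": infer_scope(row)}


def select(rows, iso_alpha2, scope):
    """Filter annotated rows by country code and scope."""
    return [r for r in rows if r["iso_alpha2"] == iso_alpha2 and r["scope"] == scope]


annotated = [
    annotate({"country_region": "United Kingdom", "province_state": None, "admin2": None}),
    annotate({"country_region": "United Kingdom", "province_state": "Gibraltar", "admin2": None}),
]
uk_country_rows = select(annotated, "GB", "country_region")
```

Because the codes and scope are added as extra keys, the existing country_region values stay untouched, which keeps the change backwards compatible. Unknown country names map to `None` here rather than raising, which is a design choice that could go either way.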

MatMoore, Apr 05 '20 10:04

I could start working on this this week, as long as there's nothing more urgent that needs doing for the MVP?

MatMoore, Apr 05 '20 11:04

> I could start working on this this week, as long as there's nothing more urgent that needs doing for the MVP?

Looking at #20 I think the only missing thing for the MVP is to improve the documentation and add a few usage examples. I will take care of this today, so feel free to start working on #43

As a next step, I would like to start working on the "Italian Protezione Civile" data source.

andreagrandi, Apr 05 '20 11:04