
Data for additional countries.

Open seabbs opened this issue 4 years ago • 7 comments

I think it makes sense to expand to more datasets now.

There has been interest in the following:

  • [ ] Burkina Faso
  • [ ] Iraq
  • [ ] Democratic Republic of the Congo
  • [ ] Syria

Do you have any idea of sources @ffinger?

seabbs avatar Apr 09 '20 09:04 seabbs

I will try to find sources for those ASAP.

Here's an issue tracking requests for sub-national data in LMIC: https://github.com/reconhub/covid19hub/issues/5

With a dedicated spreadsheet: https://docs.google.com/spreadsheets/d/1uvg07BAmwKqLqhKvkejhkX7uvXiGCre4sz11Au3pz9Q/edit?usp=sharing

ffinger avatar Apr 09 '20 10:04 ffinger

Are the countries listed above still of interest?

@ColinFay has coded a few things for Burkina Faso: https://github.com/reconhub/covid19hub/issues/5

… and the data does seem to include confirmed cases (not just contacts). It might take some proofreading, but perhaps this can be achieved (by hand, even, if needed).

Also, @ffinger's spreadsheet mentions Switzerland: openZH has already compiled the data, but would you like a function that imports it into NCoVUtils, like those the package already includes? What columns do you need beyond cases and deaths?

briatte avatar Apr 28 '20 15:04 briatte

Hi @briatte, For each country, we are looking for the following columns:

  • country
  • region
  • date
  • new reported cases on date in region
  • new reported deaths on date in region

If available, the number of newly recovered, the number of tests done, and the numbers of positive and negative test results are also very useful.
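As a rough illustration of that layout, here is a minimal base R sketch. The column names (`cases_new`, `deaths_new`, `tested_new`, and so on), the helper `check_regional_layout()`, and the example values are all assumptions made up for this sketch; they are not the names NCoVUtils itself uses.

```r
# Hypothetical column names for the requested layout (assumptions, not the
# package's actual naming).
required_cols <- c("country", "region", "date", "cases_new", "deaths_new")
optional_cols <- c("recovered_new", "tested_new",
                   "tested_positive", "tested_negative")

# Hypothetical helper: stops if a required column is missing or `date` is not
# a Date, and otherwise returns the optional columns that are present.
check_regional_layout <- function(df) {
  missing_cols <- setdiff(required_cols, names(df))
  if (length(missing_cols) > 0) {
    stop("Missing required columns: ", paste(missing_cols, collapse = ", "))
  }
  if (!inherits(df$date, "Date")) {
    stop("`date` must be of class Date")
  }
  intersect(optional_cols, names(df))
}

# Made-up example rows, for illustration only (not real data).
example <- data.frame(
  country = "Switzerland",
  region = c("Zurich", "Bern"),
  date = as.Date("2020-04-28"),
  cases_new = c(12L, 7L),
  deaths_new = c(1L, 0L),
  tested_new = c(300L, 150L),
  stringsAsFactors = FALSE
)

present_optional <- check_regional_layout(example)
```

A check like this could run at the end of each country's import function, so that every source ends up in the same shape regardless of its original format.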

Ideally regions should be named so that they match a reference geography (if available), for instance provided in rnaturalearth::ne_states() or in a separate reference file.
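One way to check region names against a reference geography is sketched below in base R. The reference vector is hard-coded and hypothetical; in practice it could be derived from rnaturalearth::ne_states() or read from a separate reference file. The source names are also made up for illustration.

```r
# Hypothetical reference names (assumption: stands in for a real reference
# geography such as the output of rnaturalearth::ne_states()).
reference_regions <- c("Zurich", "Bern", "Geneve", "Vaud")

# Region names as they might appear in a raw source (made up).
source_regions <- c("Zurich", "bern ", "Geneva")

# Normalise case and surrounding whitespace before comparing.
normalise <- function(x) tolower(trimws(x))

matched <- normalise(source_regions) %in% normalise(reference_regions)

# Names that still need a manual mapping to the reference geography.
unmatched_regions <- source_regions[!matched]
```

Anything left in `unmatched_regions` would need a manual lookup table, since spelling variants such as translated names cannot be resolved by normalising case and whitespace alone.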

@seabbs are there any other columns that are required or would ideally be provided?

ffinger avatar Apr 28 '20 16:04 ffinger

I added some additional countries and potential data sources in the spreadsheet. The data are all on HDX, but in different formats. It would be great if someone could give those a go:

  • [ ] Venezuela
  • [ ] Indonesia
  • [ ] Colombia
  • [ ] Haiti
  • [ ] Senegal
  • [ ] Mali

Links to data sources here:

https://docs.google.com/spreadsheets/d/1uvg07BAmwKqLqhKvkejhkX7uvXiGCre4sz11Au3pz9Q/edit?usp=sharing

ffinger avatar Apr 28 '20 16:04 ffinger

Also consider this source for European countries: https://github.com/ec-jrc/COVID-19/tree/master/data-by-region https://data.humdata.org/dataset/europe-covid-19-subnational-cases

ffinger avatar Apr 28 '20 16:04 ffinger

Hi, I'm trying to add Mexico to the list. We are already extracting the data from official sources.

cwallaceh avatar Jul 15 '20 05:07 cwallaceh

Hi @cwallaceh - great, we'd be keen to add Mexico as well. Do you have a link to the data (and/or R code to extract and clean)? No problem if not, we are happy to do this.

I also wanted to flag that we are planning to fully replace NCoVUtils with a new package. This will have all the same functionality as NCoVUtils, and we think it will be much easier to use. It will include regional data for all the current countries and some new ones. However, it is not quite CRAN-ready yet, so we will advertise it more widely when it is ready.

kathsherratt avatar Jul 17 '20 05:07 kathsherratt