covid-dashboard
Process data at a finer spatial granularity
The data from Johns Hopkins comes at the level of regions / provinces. We should ideally build the map at this level. Forecasting at this level would also be interesting, provided that there are enough cases (forecasting from few cases is unreliable).
This enhancement will require some work, but it seems a worthwhile addition to the site.
I think that the challenge is plotting on the map: we need to get the shape of each region / province. Maybe it's a geojson?
Here is a discussion on US states in a world map: https://community.plot.ly/t/state-boundaries-on-a-world-map-projection/11698/4
Based on the following documentation, the only predefined geometries are the countries and the US states: https://plot.ly/python/choropleth-maps/#using-builtin-country-and-state-geometries
Currently taking a look at http://www.naturalearthdata.com/downloads/110m-cultural-vectors/ @jorisvandenbossche do you think it's a good resource, or would you recommend another one? (sorry for the ping!)
If we want a quick solution, we could use the Lat / Lon info of the dataset to draw a scatter plot with one marker per lat / lon tuple. No shape files are needed for that. Of course having the shapes is nicer, but it also means more data to load for the whole page.
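A rough sketch of this quick solution, using only the standard library: it reads rows in the layout of the Johns Hopkins file and builds a plain dict in the shape of a plotly `scattergeo` trace. The sample rows, the column names other than `Country/Region`, the function name `scattergeo_trace` and the marker scaling are all assumptions made for illustration.

```python
import csv
import io
import math

# Made-up sample rows in the assumed layout of the time series file
# (columns: Province/State, Country/Region, Lat, Long, then one column per date).
SAMPLE_CSV = """Province/State,Country/Region,Lat,Long,3/1/20,3/2/20
Hubei,China,30.9756,112.2707,100,120
British Columbia,Canada,49.2827,-123.1207,8,9
,Japan,36.0,138.0,50,60
"""


def scattergeo_trace(csv_text):
    """Build a plotly-style 'scattergeo' trace dict from the last date column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    last_date = reader.fieldnames[-1]  # most recent date column
    lat, lon, text, size = [], [], [], []
    for row in reader:
        cases = int(row[last_date])
        province, country = row["Province/State"], row["Country/Region"]
        # Rows without a province are labelled with the country only.
        label = f"{province}, {country}" if province else country
        lat.append(float(row["Lat"]))
        lon.append(float(row["Long"]))
        text.append(f"{label}: {cases}")
        # Hypothetical scaling: square root of the count, with a minimum size.
        size.append(max(4.0, math.sqrt(cases)))
    return {"type": "scattergeo", "lat": lat, "lon": lon,
            "text": text, "marker": {"size": size}}


trace = scattergeo_trace(SAMPLE_CSV)
print(trace["text"])
```

If plotly is installed, the dict should be usable as `plotly.graph_objects.Figure(data=[trace])`, so no shape data is shipped with the page.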
Natural Earth indeed has States/Provinces, but the question will still be if that matches the regions as provided in the data. Is there an example of the data?
Thanks a lot for your input @jorisvandenbossche :-). An example of the dataset is https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_19-covid-Confirmed.csv
Thanks for that link. So https://www.naturalearthdata.com/downloads/10m-cultural-vectors/10m-admin-1-states-provinces/ has state and province shapes. I can take a look tomorrow to see whether it is relatively straightforward to match those. But e.g. the COVID data for the US even comes per county, not per state (although it should be easy to aggregate those per state).
Thanks for taking a look. The Johns Hopkins dataset (which we are using at the moment) only has province / state information for a handful of countries (it might change in the future)
>>> countries = df['Country/Region']
>>> countries.value_counts()[:50]
US 247
China 33
Canada 12
France 9
Australia 9
United Kingdom 7
Netherlands 4
Denmark 3
Japan 1
So for now this correspondence must be checked for 8 countries. I'll try it with the US states (county-level information is great but state-level should be fine for now), since plotly's choropleth trace already knows the geometry of US states.
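Checking the correspondence could look roughly like the sketch below: normalize the names on both sides and list the dataset regions that have no counterpart among the shape names. The helper names `normalize` and `unmatched_regions` and both name lists are hypothetical; in practice the shape names would come from the Natural Earth attribute table.

```python
def normalize(name):
    """Lowercase the name and collapse whitespace so trivial differences do not matter."""
    return " ".join(name.casefold().split())


def unmatched_regions(dataset_names, shape_names):
    """Return the dataset region names that have no match among the shape names."""
    known = {normalize(n) for n in shape_names}
    return sorted(n for n in set(dataset_names) if normalize(n) not in known)


# Illustrative names only, not taken from either real source.
dataset = ["Hubei", "British  Columbia", "new south wales", "Diamond Princess"]
shapes = ["Hubei", "British Columbia", "New South Wales", "Ontario"]

missing = unmatched_regions(dataset, shapes)
print(missing)  # → ['Diamond Princess']
```

Anything left in `missing` would need a manual mapping or would have to be dropped from the map.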
Related to this: #79 . We can assume that county-based data are incomplete or not reliable, so let's not use them and focus on state-level data for the US.
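A minimal sketch of keeping only state-level rows, assuming that county rows are written as "County, ST" (with a comma) in the Province/State column while state rows contain no comma. That format, the helper name `is_state_level` and the sample values are assumptions to verify against the actual file.

```python
def is_state_level(province_state):
    """Assumption: county rows contain a comma ("County, ST"); state rows do not."""
    return "," not in province_state


# Made-up (name, cases) pairs for illustration.
rows = [("Washington", 500), ("King County, WA", 300), ("New York", 400)]

# Drop the county rows and keep the state rows untouched.
state_rows = [(name, cases) for name, cases in rows if is_state_level(name)]
print(state_rows)  # → [('Washington', 500), ('New York', 400)]
```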
Also see https://github.com/CSSEGISandData/COVID-19/issues/1250
In fact, region information is only useful for Canada, Australia and China. For the other countries, the regions correspond to overseas territories.