
Ukraine data patched using WHO data

Open CSSEGISandData opened this issue 1 year ago • 6 comments

Dear all,

We were made aware that WHO has continued to update its published COVID-19 data while the national source has gone offline. We have switched our sourcing to WHO and have corrected our historical data. MR #5882 has patched our data from 2/25/2022 to present. As noted in our README, consistent with the designations of the United States Department of State, our total for Ukraine is inclusive of Sevastopol and the Crimea Republic (sourced from the Russian government (source)). To create our updated data, we have combined data from the World Health Organization with our daily reports, summing Ukraine, Sevastopol, and Crimea Republic to obtain the total cases and deaths for Ukraine. For the daily reports, the difference between our time series and the WHO time series data has been placed into a new "Unknown, Ukraine" entry.

A consequence of our Ukraine data having been stale since 3/1/2022 is that data for Sevastopol and Crimea Republic was also not being collected or published. Including these updated figures now results in a substantial spike in cases and deaths for Ukraine. We have mitigated this spike to the best of our ability by back-distributing the Sevastopol and Crimea Republic data as far back as we are able (6/13/2022). We hope that this is not too disruptive.
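The post does not say how the back distribution was carried out. As a rough sketch of one simple possibility (an assumption, not the maintainers' confirmed method), a late-arriving increment can be spread evenly over the days in the window so that the cumulative series rises gradually instead of jumping on one day. The increment of 1000 and the window end date of 7/12/2022 are hypothetical values chosen for illustration; only the 6/13/2022 start date comes from the post.

```python
from datetime import date


def back_distribute(total: int, n_days: int) -> list[int]:
    """Split `total` into `n_days` non-negative daily increments that sum to `total`."""
    if n_days <= 0:
        raise ValueError("n_days must be positive")
    base, extra = divmod(total, n_days)
    # The first `extra` days receive one additional unit so the parts sum exactly.
    return [base + 1 if i < extra else base for i in range(n_days)]


# Window start is taken from the post; the end date is an assumption.
window_start = date(2022, 6, 13)
window_end = date(2022, 7, 12)
n_days = (window_end - window_start).days + 1  # inclusive day count

# Hypothetical late-arriving increment, for illustration only.
daily = back_distribute(1000, n_days)
print(n_days, sum(daily), min(daily), max(daily))
```

An even split is the simplest choice; weighting by an observed trend would be an equally plausible alternative.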

Please let us know if you have any questions.

7/13/2022 8:00 AM BST: We are aware of an issue with the correction from 6/13/2022 to present and are working on a correction.

7/13/2022 6:00 PM BST: The above error has been patched.

CSSEGISandData avatar Jul 12 '22 20:07 CSSEGISandData

Hello, Thank you very much for this patch! I have a question about how this affects data from Crimea.

It seems that the Russian government dashboard you are using as a source for Russia data also includes COVID cases in Crimea. Are these numbers already being subtracted from the Russian total?

eleanorlutz avatar Jul 14 '22 13:07 eleanorlutz

Hello @eleanorlutz we can confirm that these numbers are already being subtracted from the Russian source. Data for Crimea and Sevastopol will be sourced from the Russian dashboard moving forward.

CSSEGISandData avatar Jul 18 '22 14:07 CSSEGISandData

Thank you very much for the clarification!

eleanorlutz avatar Jul 18 '22 15:07 eleanorlutz

hi. thanks, as always, for all the data.

an oddity: it seems like every (?) .csv file now ends with a "zero-case" "Unknown,Ukraine" line. e.g.:

05-10-2020.csv:,,Unknown,Ukraine,2020-05-11 02:32:30,,,0,0,0,0,"Unknown, Ukraine"

maybe these shouldn't be there?
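A minimal sketch of how such rows could be flagged, using the line quoted above. The header row is an assumption about the daily report column layout (it is not shown in this thread), and the second data row uses made-up numbers purely for contrast.

```python
import csv
import io

# Assumed column layout for the daily report; not confirmed in this thread.
HEADER = (
    "FIPS,Admin2,Province_State,Country_Region,Last_Update,Lat,Long_,"
    "Confirmed,Deaths,Recovered,Active,Combined_Key"
)
SAMPLE = "\n".join([
    HEADER,
    # Row quoted in the comment above.
    ',,Unknown,Ukraine,2020-05-11 02:32:30,,,0,0,0,0,"Unknown, Ukraine"',
    # Hypothetical non-zero row, for contrast only.
    ",,,Ukraine,2020-05-11 02:32:30,,,100,5,20,75,Ukraine",
]) + "\n"

COUNT_COLUMNS = ("Confirmed", "Deaths", "Recovered", "Active")


def is_zero_unknown_ukraine(row: dict) -> bool:
    """True for an 'Unknown, Ukraine' row whose counts are all zero or blank."""
    return (
        row.get("Province_State") == "Unknown"
        and row.get("Country_Region") == "Ukraine"
        and all(float(row.get(col) or 0) == 0 for col in COUNT_COLUMNS)
    )


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
flagged = [row for row in rows if is_zero_unknown_ukraine(row)]
print(len(flagged), flagged[0]["Combined_Key"])
```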

cheers!

greg-minshall avatar Jul 26 '22 04:07 greg-minshall

@greg-minshall thank you for your comment. We have had to add an "Unknown, Ukraine" entry into our database due to the WHO COVID-19 dashboard only providing national level case totals while our historical data sourced from the Ukrainian dashboard was at the provincial level. As the WHO total cannot be placed into any of the provinces, we've placed the difference between our historic total and the WHO national total in the "Unknown, Ukraine" entry. For consistency and to ensure the script generating the dashboard visualization runs properly, we've added this entry into the historic daily reports.
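The arithmetic described above can be sketched as follows. This is an illustration under stated assumptions, not the maintainers' script: the province names and all figures are hypothetical, and `unknown_residual` is a helper name introduced here.

```python
from __future__ import annotations


def unknown_residual(who_national_total: int, province_totals: dict[str, int]) -> int:
    """Amount that cannot be attributed to any province.

    This is the WHO national total minus the sum of the last known
    province-level totals, i.e. the value for a catch-all "Unknown" row.
    """
    return who_national_total - sum(province_totals.values())


# Hypothetical province-level totals frozen at the last provincial update.
provinces = {"Province A": 1000, "Province B": 600, "Province C": 200}
# Hypothetical national total from the national-level source.
who_total = 2000

unknown = unknown_residual(who_total, provinces)
print(unknown)  # 200

# With the residual row added, the rows again sum to the national total.
assert unknown + sum(provinces.values()) == who_total
```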

CSSEGISandData avatar Jul 27 '22 08:07 CSSEGISandData

@CSSEGISandData thanks very much for the reply. i understand the addition of "Unknown,Ukraine" after the WHO data took over the previous reporting. (and, i applaud using it to record the difference between that and the sum of the province-level data as of ~ 01 Mar 2022!)

what i would suggest is removing "Unknown,Ukraine" from the .csv files prior to the inclusion of province-level data. i.e., (if i'm getting it right), from 22 Jan 2020 (a time when there is, in fact, not even national-level Ukrainian data available) through 31 May 2020 (as province-level data seems to first appear on 01 Jun 2020). that might be "cleaner" (ycocmv -- "your concept of cleaner may vary" :)

(and, if removing it up until the province-level data stopped being available -- when you switched to WHO data -- maybe 01 Mar 2022? -- that might be even cleaner; but, i can imagine how a "dashboard", or the user thereof, might find that confusing.)
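A small sketch of how the files in the suggested range could be picked out, assuming the daily report filenames follow the `MM-DD-YYYY.csv` pattern seen in `05-10-2020.csv`. The range bounds are the dates proposed above; the function names are illustrative.

```python
from datetime import date, datetime

# Bounds taken from the suggestion above (22 Jan 2020 through 31 May 2020).
FIRST_REPORT = date(2020, 1, 22)
LAST_WITHOUT_PROVINCES = date(2020, 5, 31)


def report_date(filename: str) -> date:
    """Parse the date out of a daily report filename such as '05-10-2020.csv'."""
    return datetime.strptime(filename, "%m-%d-%Y.csv").date()


def in_suggested_range(filename: str) -> bool:
    """True if the report predates the first province-level Ukraine data."""
    return FIRST_REPORT <= report_date(filename) <= LAST_WITHOUT_PROVINCES


for name in ("01-22-2020.csv", "05-10-2020.csv", "06-01-2020.csv"):
    print(name, in_suggested_range(name))
```

Extending the upper bound to the date the WHO data took over would cover the second, wider option.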

cheers, Greg

greg-minshall avatar Jul 27 '22 19:07 greg-minshall