Is similar data available for other countries (number of deaths by death date)?

Open PierreMesure opened this issue 4 years ago • 35 comments

Hej Adam,

Thanks for your awesome work, it's still one of the few graphs I browse regularly to keep me updated on the COVID situation in Sweden.

I find it really shameful that the reporting delay is so bad and that it has gotten worse during this second wave. But I actually have no certainty that it's better in other countries. I've been looking for datasets that would enable me to determine if that's the case, without much success so far. Have you found any?

PierreMesure avatar Jan 10 '21 19:01 PierreMesure

Long time since I looked, but back then I couldn't find it anywhere. I most definitely share your concern though. If you find any please do post it here.

adamaltmejd avatar Jan 12 '21 13:01 adamaltmejd

I asked in a Nordic data journalism community and was told that Finland started publishing deaths by death date last fall. I haven't been able to find the raw data yet but here is an example: https://sampo.thl.fi/pivot/prod/en/epirapo/covid19case/fact_epirapo_covid19case?column=measure-492118&row=dateweek20200101-508804L

I guess by getting the raw data for each publishing day, we could recreate the same data you use from FHM.
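To make the idea concrete, here is a minimal sketch of how the reporting delay could be derived from one snapshot per publishing day. It assumes each snapshot maps a death date to the cumulative number of deaths reported for that date so far; the function names and the numbers are made up for illustration and are not taken from the repo.

```python
from datetime import date


def newly_reported(previous, current):
    """Deaths added per death date between two consecutive snapshots.

    Each snapshot is a dict {death_date: cumulative deaths reported so far}.
    """
    added = {}
    for death_date, total in current.items():
        diff = total - previous.get(death_date, 0)
        if diff != 0:  # a negative value would be a downward revision
            added[death_date] = diff
    return added


def reporting_delays(snapshots):
    """Turn {publication_date: snapshot} into (death_date, delay_days, count) rows."""
    rows = []
    pub_dates = sorted(snapshots)
    # The first snapshot has no baseline to compare with, so delays are
    # only known from the second publication date onwards.
    for prev_pub, pub in zip(pub_dates, pub_dates[1:]):
        added = newly_reported(snapshots[prev_pub], snapshots[pub])
        for death_date, count in sorted(added.items()):
            rows.append((death_date, (pub - death_date).days, count))
    return rows


# Made-up numbers, for illustration only.
snapshots = {
    date(2021, 1, 18): {date(2021, 1, 15): 10, date(2021, 1, 16): 4},
    date(2021, 1, 19): {date(2021, 1, 15): 12, date(2021, 1, 16): 9, date(2021, 1, 17): 3},
}
rows = reporting_delays(snapshots)
```

With these two snapshots, only the deaths added on 19 January get a delay, which is why older published files matter so much.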

PierreMesure avatar Jan 18 '21 09:01 PierreMesure

Thanks! Yes indeed! It seems like their API creates csv links like this: https://sampo.thl.fi/pivot/prod/sv/epirapo/covid19case/fact_epirapo_covid19case.csv?row=dateweek20200101-508804L&column=measure-492118&

Should be able to set up a downloader and host the data on this site. Very nice. Think I'll have some time to implement it tonight.
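Roughly, the link could be assembled like this. This is only a sketch: the `row` and `column` values are copied from the URL above, while the helper names and the dated file name pattern are my own assumptions, not anything THL documents.

```python
from datetime import date
from urllib.parse import urlencode

BASE = "https://sampo.thl.fi/pivot/prod"
CUBE = "epirapo/covid19case/fact_epirapo_covid19case"


def thl_csv_url(row, column, lang="sv"):
    """Build a CSV link shaped like the one above (hypothetical helper)."""
    query = urlencode({"row": row, "column": column})
    return f"{BASE}/{lang}/{CUBE}.csv?{query}"


def snapshot_filename(pub_date):
    """Name one downloaded file per publishing day (assumed naming scheme)."""
    return f"thl_deaths_{pub_date.isoformat()}.csv"


url = thl_csv_url("dateweek20200101-508804L", "measure-492118")
filename = snapshot_filename(date(2021, 1, 18))
```

Saving one file per day under its publication date keeps every release, so later releases can be compared with earlier ones.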

adamaltmejd avatar Jan 18 '21 09:01 adamaltmejd

Great, I can try and write them an FOIA request to see if they have some older files. EDIT: sent.

PierreMesure avatar Jan 18 '21 09:01 PierreMesure

That would be awesome. Starting to download now would mean one could only see the extent of the reporting delay from today onwards.

adamaltmejd avatar Jan 18 '21 09:01 adamaltmejd

Hi @juhanisaa, I see that you have collected a lot of Finnish COVID data and are explaining it here.

I am tagging you here because we are currently looking for older versions of the list of deaths by death date published through THL's API (this call). Is this something that you might have saved on your servers? I had a look but couldn't find it.

Thanks in advance.

PierreMesure avatar Jan 18 '21 11:01 PierreMesure

I believe we haven't stored that data, but I'll have to confirm.

juhanisaa avatar Jan 18 '21 19:01 juhanisaa

Thanks @juhanisaa! Don't hesitate to tell us if you get your hands on it!

@adamaltmejd, I found another country really interesting to compare with! The UK has death data by death date!

PierreMesure avatar Jan 24 '21 23:01 PierreMesure

Very nice! Thanks. I'll try to find some time to put together crawlers. Sorry for not doing it yet.

adamaltmejd avatar Jan 25 '21 07:01 adamaltmejd

No worries, I'll start downloading them manually this week so we don't lose more data. I'm also sending an FOIA request to the British authorities to try to get the old files. From what I could see, some months are available on GitHub.

PierreMesure avatar Jan 25 '21 13:01 PierreMesure

Very interesting! Keep me posted :)

adamaltmejd avatar Jan 25 '21 13:01 adamaltmejd

OK, I've just set up the code to download data from the Finnish and UK repos. Let's see if it works :)

adamaltmejd avatar Jan 26 '21 13:01 adamaltmejd

In this repo, we have bigger CSV files containing deaths by date of death among other things, dating back to the 13th of October. Thanks @theosanderson!

Here is a repo having the period 23/08 -> 30/11. Thanks @nathanrawle!

In this repo, there is a file named death_data.csv that has been updated every day since the 26th of October. Thanks @rvaughan!

Finally, in this repo, the same data is present since the 8th of December. Thanks @msleigh!

Note that the first one distinguishes by county and the other two by nation. Maybe we want to download this file in the future? In any case, it seems to be fetched automatically on the first aforementioned repo.

PierreMesure avatar Jan 27 '21 16:01 PierreMesure

For the UK you can download data going back further with Archive in https://coronavirus.data.gov.uk/details/download (didn't exist when I made my repo)

theosanderson avatar Jan 27 '21 17:01 theosanderson

No problem. I obtained the data from the API @theosanderson mentioned, from which you can access whichever metrics you want as they were published on a given date in the past, going back as far as 23 August. Releases after 30 Nov up to yesterday can be scraped in the same way now.

nathanrawle avatar Jan 27 '21 18:01 nathanrawle

Fantastic, thanks everyone! I'll put together a dataset with daily releases to measure reporting delay and to evaluate our model on. Exciting :)

adamaltmejd avatar Jan 27 '21 18:01 adamaltmejd

Be aware that the release for 7/10/2020 is missing from https://coronavirus.data.gov.uk/details/download and will return HTTP 200 with no content.
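A scraper therefore cannot trust the status code alone. Here is a small sketch of the kind of check that would catch it; the function names and the CSV bodies are made up for illustration, and the actual download step is left out.

```python
def usable_release(status, body):
    """True only if the request succeeded and the body actually carries data."""
    return status == 200 and len(body.strip()) > 0


def keep_usable(responses):
    """Filter {release_date: (status, body)} down to releases with content."""
    return {
        release: body
        for release, (status, body) in responses.items()
        if usable_release(status, body)
    }


# Made-up responses, for illustration only.
responses = {
    "2020-10-06": (200, b"date,deaths\n2020-10-05,3\n"),
    "2020-10-07": (200, b""),  # the empty release: 200 but no content
    "2020-10-08": (200, b"date,deaths\n2020-10-07,5\n"),
}
kept = keep_usable(responses)
```

Skipping the empty release rather than saving it avoids treating that day as a snapshot with zero deaths.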

nathanrawle avatar Jan 27 '21 21:01 nathanrawle

Hej @adamaltmejd, would you like some help converting the new data to the same format you feed into the current graphs? I don't have any experience with R but it shouldn't be too hard to build on your code. I just need to get a dev env running.

After getting that working and maybe a graph comparing Sweden's delay with the others, I thought it could be interesting to write a blog post about the findings.

PierreMesure avatar Feb 03 '21 10:02 PierreMesure

Feel free to explore it if you want! I won't have time to do anything for a week or two.

adamaltmejd avatar Feb 03 '21 10:02 adamaltmejd

Made a version of my graph for the UK. Can be seen here: https://adamaltmejd.se/covid/deaths_lag_uk.png

adamaltmejd avatar Mar 05 '21 20:03 adamaltmejd

Awesome! I actually played with it myself, but I ran into so many small bugs in R while trying to recompute the delta-t for the data since last summer that I gave up at some point.

PierreMesure avatar Mar 05 '21 23:03 PierreMesure

What's your early analysis? It seems like the British data has some interesting constants (no same-day data, no data on Sundays or on public holidays) that are similar to the Swedish ones.

But besides that, there is just so much less blue on the UK's graph. They seem to be reporting the deaths many times faster, and the ones over 14 days late are rare.

PierreMesure avatar Mar 05 '21 23:03 PierreMesure

It's impossible to know the causes of such a difference at this point. Difference in death confirmation method? Different delays in reporting? Priority given to accuracy versus speed?

But it would be interesting to discuss it with journalists and see if they can investigate and maybe question FHM about it.

PierreMesure avatar Mar 05 '21 23:03 PierreMesure

Agreed it is super interesting. Agree with your observations too. My bet on the main reason for the big delays in Sweden is that we have a system in place already for death reporting at the national level - and that system has been used also for Covid. The problem is that it wasn't designed to be fast. The doctor who signs the death certificate has something like two weeks to send it in. So what has always worked well now has a speed problem that is not easy to fix.

adamaltmejd avatar Mar 10 '21 12:03 adamaltmejd

That makes sense, although here, as with much of the Swedish government's handling of COVID, it's hard to justify when other countries were able to do better.

How close would you say you are to generating the other graphs (reporting delay) and some with the Finnish data? Should we wait before sending this to journalists?

I think it would be great to send that to the data team at DN, which has a graph similar to yours with deaths by death date. They would be able to double-check the data and code. Emanuel Karlsten would also probably be interested. Do you have time to do it? I could write a draft if you want.

PierreMesure avatar Mar 11 '21 09:03 PierreMesure

Adding Finland is easy now, but the issue is that we do not have data going back in time and I haven't been collecting for long. Or did you manage to get archived data?

adamaltmejd avatar Mar 11 '21 09:03 adamaltmejd

It also seems there is a bug with the Finnish data: for some reason it has stopped collecting deaths and has only collected cases for the last five days. https://github.com/adamaltmejd/covid/commit/c26de6e07f7937844b349749bd8282ec7b80023d

Trying to fix now...

adamaltmejd avatar Mar 11 '21 10:03 adamaltmejd

Last time I checked, I couldn't find any older data. And the agency didn't save it either 🤦🏻‍♂️.

Maybe we should focus on the UK for now.

PierreMesure avatar Mar 11 '21 10:03 PierreMesure

No idea why but seems we lost 6 days of downloads... Really unfortunate.

adamaltmejd avatar Mar 11 '21 10:03 adamaltmejd

Here is a proposal for an e-mail to journalists:

Hi,

I am contacting you because I think we have discovered something that may be worth your attention regarding COVID-19 and how the government is handling the pandemic.

Over the past year, Adam Altmejd, a researcher at the Stockholm School of Economics, has compiled Sweden's deaths by date of death and published visualizations showing how long it takes for deaths to be reported. They are available at adamaltmejd.se/covid, and the source code that generates and updates the visualizations is at github.com/adamaltmejd/covid. Everything is based on open data from Folkhälsomyndigheten (the Public Health Agency of Sweden).

Deaths are often reported a few days late, and during the second wave the delay has increased considerably: a majority were reported more than 7 or 14 days late.
We found this strange, so we looked for data from other countries to compare with. Unfortunately, very few countries publish deaths by date of death, but we found two: the United Kingdom and Finland.

Here are the visualizations for Sweden and the United Kingdom side by side. As you can see, the difference is very large. In the United Kingdom, reporting takes barely a few days and almost never more than 7 days.

We cannot know why the difference is so large, which is why we are contacting you as professional journalists. If you find it relevant, we hope you can look into this more deeply and perhaps put questions about it to the relevant decision-makers.

All our work on this is available on GitHub.

PierreMesure avatar Mar 11 '21 10:03 PierreMesure