Is similar data available for other countries (number of deaths by death date)?
Hej Adam,
Thanks for your awesome work, it's still one of the few graphs I browse regularly to keep me updated on the COVID situation in Sweden.
I find it really shameful that the reporting delay is so bad and that it's gotten worse during this second wave. But I actually have no certainty that it's better in other countries. I've been looking for datasets that would enable me to determine if that's the case, without much success so far. Have you found any?
Long time since I looked, but back then I couldn't find it anywhere. I most definitely share your concern though. If you find any please do post it here.
I asked on a Nordic datajournalist and was told that Finland started publishing deaths by death date last fall. I haven't been able to find the raw data yet but here is an example: https://sampo.thl.fi/pivot/prod/en/epirapo/covid19case/fact_epirapo_covid19case?column=measure-492118&row=dateweek20200101-508804L
I guess by getting the raw data for each publishing day, we could recreate the same data you use from FHM.
Thanks! Yes indeed! It seems like their API creates csv links like this: https://sampo.thl.fi/pivot/prod/sv/epirapo/covid19case/fact_epirapo_covid19case.csv?row=dateweek20200101-508804L&column=measure-492118&
Should be able to set up a downloader and host the data on this site. Very nice. Think I'll have some time to implement it tonight.
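For anyone wanting to try this themselves, here is a minimal sketch of such a downloader in Python (the repo's own crawler may well look different). It stores one file per publishing day, since the lag can only be measured by comparing consecutive releases. The URL is the one quoted above; the file naming scheme, the `snapshots` directory and the function names are assumptions made up for illustration.

```python
from datetime import date
from pathlib import Path
from typing import Callable
from urllib.request import urlopen

# CSV export URL quoted in this thread (THL, Finland).
THL_CSV_URL = (
    "https://sampo.thl.fi/pivot/prod/sv/epirapo/covid19case/"
    "fact_epirapo_covid19case.csv"
    "?row=dateweek20200101-508804L&column=measure-492118&"
)


def fetch_bytes(url: str) -> bytes:
    """Download the raw response body (needs network access)."""
    with urlopen(url, timeout=30) as response:
        return response.read()


def snapshot_path(out_dir: Path, source: str, day: date) -> Path:
    """Hypothetical naming scheme: one file per source and publishing day."""
    return out_dir / f"{source}_{day.isoformat()}.csv"


def save_snapshot(
    out_dir: Path,
    source: str,
    url: str,
    day: date,
    fetch: Callable[[str], bytes] = fetch_bytes,
) -> Path:
    """Save the release published on `day`, never overwriting an older copy."""
    target = snapshot_path(out_dir, source, day)
    if target.exists():
        # A release already stored for this day is kept as-is.
        return target
    body = fetch(url)
    out_dir.mkdir(parents=True, exist_ok=True)
    target.write_bytes(body)
    return target


if __name__ == "__main__":
    print(save_snapshot(Path("snapshots"), "thl", THL_CSV_URL, date.today()))
```

Run once a day (cron or a scheduled CI job), this builds up the per-release history. The `fetch` argument is only there so the function can be tried without hitting the real server.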
Great, I can try and write them an FOIA request to see if they have some older files. EDIT: sent.
That would be awesome. Starting to download now would mean one could only see the extent of the reporting delay from today onwards.
Hi @juhanisaa, I see that you have collected a lot of Finnish COVID data and are explaining it here.
I am tagging you here because we are currently looking for older versions of the list of deaths by death date published through THL's API (this call). Is this something that you might have saved on your servers? I had a look but couldn't find it.
Thanks in advance.
I believe we haven't stored that data, but I'll have to confirm.
Thanks @juhanisaa! Don't hesitate to tell us if you get your hands on it!
@adamaltmejd, I found another country really interesting to compare with! The UK has death data by death date!
Very nice! Thanks. I'll try to find some time to put together crawlers. Sorry for not doing it yet.
No worries, I'll start downloading them manually this week so we don't lose more data. I'm also sending an FOIA request to the British authorities to try to get the old files. From what I could see, some months are available on Github.
Very interesting! Keep me posted :)
Ok, I've just set up the code to download data from the Finnish and UK repos. Let's see if it works :)
In this repo, we have bigger CSV files containing deaths by date of death among other things, dating back to the 13th of October. Thanks @theosanderson!
Here is a repo covering the period 23/08 to 30/11. Thanks @nathanrawle!
In this repo, there is a file named death_data.csv that has been updated every day since the 26th of October. Thanks @rvaughan!
Finally, in this repo, the same data is present since the 8th of December. Thanks @msleigh!
Note that the first one distinguishes by county and the other two by nation. Maybe we want to download this file in the future? In any case, it seems to be fetched automatically on the first aforementioned repo.
For the UK you can download data going back further with Archive in https://coronavirus.data.gov.uk/details/download (didn't exist when I made my repo)
No problem. I obtained the data from the API @theosanderson mentioned, from which you can access whichever metrics you want as they were published on any given past date, going back to 23 August. Releases from the end of November up to yesterday can be scraped in the same way now.
Fantastic, thanks everyone! I'll put together a dataset with daily releases to measure reporting delay and to evaluate our model on. Exciting :)
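To make the idea concrete, here is a rough sketch of how the delay could be derived from a pile of daily releases. It is not the code behind the graphs, just an illustration in Python: it assumes each release has already been parsed into a mapping from death date to the cumulative number of deaths reported for that date, and the names `Snapshot` and `reporting_delays` are made up.

```python
from collections import defaultdict
from datetime import date
from typing import Dict

# Assumed shape of one parsed release: death date -> cumulative deaths reported.
Snapshot = Dict[date, int]


def reporting_delays(releases: Dict[date, Snapshot]) -> Dict[int, int]:
    """Count deaths by the number of days between death and first report.

    Deaths already present in the earliest release are attributed to that
    release, so their delay is an upper bound. Downward revisions are ignored.
    """
    delays: Dict[int, int] = defaultdict(int)
    previous: Snapshot = {}
    for release_day in sorted(releases):
        current = releases[release_day]
        for death_day, total in current.items():
            # Deaths for this date that were not in the previous release.
            added = total - previous.get(death_day, 0)
            if added > 0:
                delays[(release_day - death_day).days] += added
        previous = current
    return dict(delays)


if __name__ == "__main__":
    releases = {
        date(2021, 1, 3): {date(2021, 1, 1): 2, date(2021, 1, 2): 1},
        date(2021, 1, 4): {
            date(2021, 1, 1): 5,
            date(2021, 1, 2): 1,
            date(2021, 1, 3): 4,
        },
    }
    print(sorted(reporting_delays(releases).items()))
    # [(1, 5), (2, 2), (3, 3)]
```

A missing release day simply makes the next release absorb its deaths, which inflates the measured delay by a day, so gaps in the archive are worth tracking.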
Be aware that the release for 7/10/2020 is missing from https://coronavirus.data.gov.uk/details/download and will return HTTP 200 with no content.
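A scraper therefore cannot rely on the status code alone. A small guard along these lines (a sketch only; `usable_release` is a made-up helper, not part of any library) would let the crawler record the day as missing instead of saving an empty file:

```python
from typing import Optional


def usable_release(status: int, body: bytes) -> Optional[bytes]:
    """Return the body only if the response actually carries data."""
    if status != 200:
        return None
    if not body.strip():
        # HTTP 200 with an empty (or whitespace-only) body: treat as missing.
        return None
    return body
```

Returning `None` rather than raising keeps a batch download over many dates running when a single release is absent.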
Hej @adamaltmejd, would you like some help to convert the new data to the same format you feed the current graphs? I don't have any experience with R but it shouldn't be too hard to build on your code. I just need to get a dev env running.
After getting that working and maybe a graph comparing Sweden's delay with the others, I thought it could be interesting to write a blog post about the findings.
Feel free to explore it if you want! I won't have time to do anything for a week or two.
Made a version of my graph for the UK. Can be seen here: https://adamaltmejd.se/covid/deaths_lag_uk.png
Awesome! I actually played with it myself but I got so many small bugs with R trying to recompile the delta-t for the data since last Summer, I gave up at some point.
What's your early analysis? It seems like the British data has some interesting constants (no same-day data, no data on Sundays or on public holidays) that are similar to the Swedish ones.
But besides that, there is just so much less blue on the UK's graph. They seem to be reporting deaths many times faster, and the ones reported over 14 days late are anecdotal.
It's impossible to know the causes of such a difference at this point. A difference in death confirmation method? Different delays in reporting? Priority given to accuracy versus speed?
But it would be interesting to discuss it with journalists and see if they can investigate and maybe question FHM about it.
Agreed it is super interesting. Agree with your observations too. My bet on the main reason for the big delays in Sweden is that we have a system in place already for death reporting at the national level - and that system has been used also for Covid. The problem is that it wasn't designed to be fast. The doctor who signs the death certificate has something like two weeks to send it in. So what has always worked well now has a speed problem that is not easy to fix.
That makes sense, although, as with much of the Swedish government's handling of COVID, it's hard to justify when other countries were able to do better.
How close would you say you are to generating the other graphs (reporting delay) and some with the Finnish data? Should we wait before sending this to journalists?
I think it would be great to send that to the data team at DN which has a graph similar to yours with deaths by death date. They would be able to double-check the data and code. Emanuel Karlsten would also probably be interested. Do you have time to do it? I could write a draft if you want.
Adding Finland is easy now, but the issue is that we do not have data going back in time and I haven't been collecting for long. Or did you manage to get archived data?
It also seems there is a bug with the Finnish data: for some reason it has stopped collecting deaths and only collects cases for the last five days. https://github.com/adamaltmejd/covid/commit/c26de6e07f7937844b349749bd8282ec7b80023d
Trying to fix now...
Last time I checked, I couldn't find any older data. And the agency didn't save it either 🤦🏻♂️.
Maybe we should focus on the UK for now.
No idea why but seems we lost 6 days of downloads... Really unfortunate.
Here is a proposal for an e-mail to journalists:
Hi,
I am contacting you because I believe we have discovered something that may be worth your attention regarding COVID-19 and how the pandemic is being handled by the government.
Over the past year, Adam Altmejd, a researcher at the Stockholm School of Economics, has compiled Sweden's deaths by date of death and published visualizations showing how long it takes for deaths to be reported. They are available at adamaltmejd.se/covid, and the source code that generates and updates the visualizations is at github.com/adamaltmejd/covid. Everything is based on open data from Folkhälsomyndigheten (the Public Health Agency of Sweden).
Deaths are often reported a few days late, and during the second wave the delay has increased a lot: a majority were reported more than 7 or 14 days late.
We thought this was strange, so we looked for data from other countries to compare with. Unfortunately, very few countries publish deaths by date of death, but we found two: the United Kingdom and Finland.
Here are the visualizations for Sweden and the United Kingdom side by side. As you can see, the difference is very large. In the United Kingdom it takes barely a few days, and almost never more than 7 days.
We cannot know why the difference is so large, which is why we are contacting you as professional journalists. If you think it is relevant, we hope you can look into this more deeply and perhaps put questions about it to the relevant people in power.
All our work on this is on Github.