covid19germany

Feature Request: Age instead of RKI-Age-Groups?

Open Ohlsen opened this issue 2 years ago • 3 comments

Thanks for the great and easy-to-use package. I am currently looking for data on COVID-19 cases (positive tests) in Bremen by age, especially for children. So far I have only managed to find the RKI age groups, which I find too broad. The 5-11 years category in particular is a problem, because 5-year-olds are typically still in kindergarten while children aged 6 and older are in school.

Are cases by single year of age perhaps already available in the package? The dashboard at https://coronavis.dbvis.de/de/overview/dashboard/04011 has a heatmap titled "Positiv getestete nach Altersgruppen für Kreisfreie Stadt Bremen innerhalb 7 Tage pro 100k Einwohner" (positive tests by age group for the city of Bremen within 7 days per 100k inhabitants) that lets you switch the setting from RKI age groups to single years of age. The data source for that seems to be https://survstat.rki.de/.
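To illustrate why single-year data would help: once counts per year of age are available, they can be summed into whatever bands fit the question. The following is only a minimal sketch. The band boundaries, the function name `regroup_by_age`, and the counts are made up for illustration and do not come from RKI, SurvStat, or the package.

```python
# Hypothetical custom age bands (inclusive bounds); these are NOT RKI definitions.
CUSTOM_GROUPS = {
    "kindergarten (0-5)": (0, 5),
    "primary school (6-10)": (6, 10),
    "secondary school (11-17)": (11, 17),
}


def regroup_by_age(cases_by_year, groups=CUSTOM_GROUPS):
    """Sum case counts keyed by single year of age into custom age bands.

    Ages that fall outside every band are ignored.
    """
    totals = {label: 0 for label in groups}
    for age, count in cases_by_year.items():
        for label, (low, high) in groups.items():
            if low <= age <= high:
                totals[label] += count
                break
    return totals


# Made-up example counts keyed by age in years (not real data).
example_cases = {4: 3, 5: 7, 6: 12, 7: 9, 11: 5, 40: 20}
print(regroup_by_age(example_cases))
```

With bands like these, 5-year-olds and 6-year-olds end up in different groups, which is the split the 5-11 RKI group does not allow.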

Ohlsen, Jan 15 '22 12:01

I fear we don't have this information in the broad datasets available via this package. I'm honestly surprised that age-by-year information is publicly available. I thought these details were omitted for privacy reasons.

But anyway: https://survstat.rki.de seems like a very interesting platform way beyond Corona. I'm sure it would be brilliant to have an R interface for that -- although I guess it could and should be way more general to make proper use of the API. How does that work? Is it even accessible from Unix operating systems?

A quick Google search didn't turn up any R package, but there's a Python project by @rgieseke. Maybe he knows more and is willing to chime in?

nevrome, Jan 15 '22 17:01

Yes, I also thought about turning my scripts into a proper API, but it's a lot of work and probably needs someone who really understands, or has access to, the underlying data structure and the thinking behind it. There are quite a few tricky things, for example incidence rates that use recent population numbers only if a year is selected.
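To show why the choice of population numbers matters, here is a minimal sketch of a per-100,000 incidence calculation. The function name `incidence_per_100k` and all figures are made up for illustration; this is not how SurvStat computes its rates internally.

```python
def incidence_per_100k(cases, population):
    """Return cases per 100,000 inhabitants for a given population denominator."""
    if population <= 0:
        raise ValueError("population must be positive")
    # Multiply first so that round inputs give an exact result.
    return cases * 100_000 / population


# Made-up numbers: the same case count with two different population estimates.
cases = 50
print(incidence_per_100k(cases, 25_000))  # older population figure
print(incidence_per_100k(cases, 20_000))  # more recent population figure
```

The same number of cases gives a different rate depending on which population figure is used as the denominator, so it is worth checking which one a query is based on.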

Data for Bremen in 5-year intervals is available here: https://github.com/rgieseke/opencoviddata/blob/main/data/counties/survstat-covid19-cases-sk-bremen.csv There is also a file with cases per 100,000 per calendar week. (It's updated daily with a GitHub Action.)

As for R, there is a Shiny project which I believe has the respective code to fetch data from the SurvStat API: https://github.com/evolutionv2/shiny-webservice/blob/master/shiny-webservice/app.R

As for the original question, another way to get the data (without a nice API) could be to create the respective query and then post-process it with R. Something like this, depending on your query and question: https://gist.github.com/hoehleatsu/f8d08bc7ad04c0c144a11589f41ca921

This Twitter thread has a walkthrough on how to fetch data from SurvStat: https://twitter.com/AscotBlack/status/1315678941659660288

rgieseke, Jan 17 '22 07:01

Thank you both so much for your kind and very insightful replies! I will have a look at your suggestions. Have a great weekend!

PS: I also just found this on Twitter: https://twitter.com/BunterLotentony/status/1484477380018130951 It contains a link to a Google Sheet with data by single year of age: https://docs.google.com/spreadsheets/d/e/2PACX-1vR9gYiVeUw7l7bIlOjkfkiyLlgwYmQTgEeS_0lXrBwyrWtN1W7ewvPa8JeflJVQmYiajgwFZvr_o3xq/pubhtml#

Ohlsen, Jan 21 '22 15:01