Add scraper for Colombia
Location name
Colombia (COL)
Source URL
http://www.ins.gov.co/Noticias/Paginas/Coronavirus.aspx (National Institute of Health). But see below.
Notes/comments
The source URL above has a bunch of Infograms embedded. Each one can be opened in a tab, and then you can snoop the data sources using Chrome's network inspector.
Summary data
https://infogram.com/api/live/flex/5eb73bf0-6714-4bac-87cc-9ef0613bf697/c9a25571-e7c5-43c6-a7ac-d834a3b5e872?
The data is in an array of HTML chunks, e.g.:
[
"<font face=\"Montserrat, sans-serif\" color=\"#ed1e79\" style=\"font-size: 22px;\"><b>1.485</b></font>",
"<font face=\"Montserrat, sans-serif\" color=\"\" style=\"font-size: 13px;\">Casos <b>Confirmados en Colombia</b></font>",
"boyPath"
],
Shows 1,485 confirmed cases.
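A minimal sketch of how a scraper might turn those HTML chunks into usable values, assuming the chunks look like the sample above and that the period in "1.485" is a thousands separator. The helper names `strip_tags` and `parse_count` are made up for illustration; only the standard library is used.

```python
import re
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects the text content of an HTML fragment, dropping all tags."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_tags(chunk):
    """Return the plain text of an HTML chunk (hypothetical helper)."""
    parser = _TextExtractor()
    parser.feed(chunk)
    parser.close()
    return "".join(parser.parts).strip()


def parse_count(chunk):
    """Parse a count such as '1.485' into an int.

    Assumption: a period is a thousands separator, as in the sample above.
    """
    text = strip_tags(chunk)
    if not re.fullmatch(r"\d{1,3}(\.\d{3})+|\d+", text):
        raise ValueError(f"not a recognizable count: {text!r}")
    return int(text.replace(".", ""))


# The sample entry quoted in this issue.
sample = [
    '<font face="Montserrat, sans-serif" color="#ed1e79" style="font-size: 22px;"><b>1.485</b></font>',
    '<font face="Montserrat, sans-serif" color="" style="font-size: 13px;">Casos <b>Confirmados en Colombia</b></font>',
    "boyPath",
]
count = parse_count(sample[0])
label = strip_tags(sample[1])
```

Raising on anything that does not look like a count should make it obvious if the Infogram layout changes, rather than silently reporting a wrong number.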
Number of cases by "departamento" (state)
https://infogram.com/api/live/flex/5e0d85ae-48a4-4899-a679-5ee9aab66d4b/266e0a29-b843-4891-9da4-12325531507b?
Status of positive cases (e.g. hospitalized, deceased, etc.)
https://infogram.com/api/live/flex/de2e4d7c-f649-409e-a874-a7f3f6033ef1/f9098f49-e26a-4843-8291-e78cb0d9aef0?
Breakdown by gender and age
https://infogram.com/api/live/flex/de2e4d7c-f649-409e-a874-a7f3f6033ef1/406f17bb-9a08-4b76-9984-63941d87a790?
List of cases
https://infogram.com/api/live/flex/bc384047-e71c-47d9-b606-1eb6a29962e3/664bc407-2569-4ab8-b7fb-9deb668ddb7a?
This is a table structured as an array of rows. The header row is:
"ID de caso" - case ID
"Fecha de diagnóstico" - date of diagnosis
"Ciudad de ubicación" - city
"Departamento o Distrito" - state or district (assuming that's a county)
"Atención**" - status. They note that "recuperado" (recovered) requires two negative tests.
"Edad" - age
"Sexo" - gender
"Tipo*" - type of case: "importado" (which they define as having come from a country with confirmed COVID-19 cases) or "relacionado" (confirmed to have had contact with someone who has COVID-19)
"País de procedencia" - country considered the source of the infection for this patient
Status can be:
"casa" - self-quarantining at home (I'm assuming here, based on what I've seen in other Latin American countries)
"fallecido" - deceased
"recuperado" - recovered; requires two negative tests to confirm
"hospital" - hospitalized
"hospital UCI" - intensive care
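The header and status translations above could be applied roughly like this. It is only a sketch: it assumes the rows have already been pulled out of the Infogram response as lists of strings with the header row first, the English field names are my own choice, and the sample rows are invented placeholders rather than real cases.

```python
# Spanish header -> English field name (field names are illustrative).
HEADER_TO_FIELD = {
    "ID de caso": "case_id",
    "Fecha de diagnóstico": "diagnosis_date",
    "Ciudad de ubicación": "city",
    "Departamento o Distrito": "state_or_district",
    "Atención**": "status",
    "Edad": "age",
    "Sexo": "gender",
    "Tipo*": "case_type",
    "País de procedencia": "origin_country",
}

# Status values listed in this issue -> normalized English labels.
STATUS_TO_ENGLISH = {
    "casa": "home",
    "fallecido": "deceased",
    "recuperado": "recovered",
    "hospital": "hospitalized",
    "hospital uci": "intensive_care",
}


def rows_to_cases(rows):
    """Turn [header, row, row, ...] into a list of dicts with English keys."""
    header, *body = rows
    # Unknown headers are kept as-is so nothing is silently dropped.
    fields = [HEADER_TO_FIELD.get(h.strip(), h.strip()) for h in header]
    cases = []
    for row in body:
        case = dict(zip(fields, (cell.strip() for cell in row)))
        if "status" in case:
            raw = case["status"]
            # Unknown statuses pass through unchanged.
            case["status"] = STATUS_TO_ENGLISH.get(raw.lower(), raw)
        cases.append(case)
    return cases


# Placeholder rows for illustration only; not real data.
example_rows = [
    ["ID de caso", "Ciudad de ubicación", "Atención**", "Tipo*"],
    ["1", "Ciudad A", "Hospital UCI", "Importado"],
    ["2", "Ciudad B", "Casa", "Relacionado"],
]
cases = rows_to_cases(example_rows)
```

All values are left as strings here; date and age parsing would depend on the exact formats in the live table.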
Time series and test data
https://infogram.com/api/live/flex/bc384047-e71c-47d9-b606-1eb6a29962e3/523ca417-2781-47f0-87e8-1ccc2d5c2839?
There are two series: one has total cases, deaths, and recoveries; the other is a weekly count of tests processed and the test backlog.
Additional sources
I also found some open data sources on ArcGIS Hub - https://hub.arcgis.com/search?categories=covid-19&collection=Dataset
You can get JSONs out of all of these.
The license on each of these implies that they are from the same government entity as the Infograms above.
Each dataset has a different hash in its download URL, but the number after the underscore appears to be what selects which data you get.
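For reference, the CSV links listed in this issue all follow the pattern `<hash>_<number>.csv`. A small sketch of building them from that pattern; the function name and the dictionary keys are made up, and the hash/number pairs are copied from the links in this issue.

```python
def arcgis_csv_url(dataset_hash, layer):
    """Build a CSV download URL in the form used by the links in this issue."""
    return f"https://opendata.arcgis.com/datasets/{dataset_hash}_{layer}.csv"


# (hash, number after the underscore) for each dataset listed in this issue.
DATASETS = {
    "source_of_cases": ("3a505d6969c149f98b122fb0a6fd1e7e", 4),
    "cases_by_state": ("ed48c4ce9ca94d5499f1c327f8f532f1", 1),
    "cases_by_municipality": ("53beb24d21f146c38a42db63c92e3460", 0),
    "case_details": ("0e14099fac45422896d50bd52292faea", 3),
    "time_series": ("782122624f364fbdbd7e287b96c4a358", 6),
}

urls = {name: arcgis_csv_url(h, n) for name, (h, n) in DATASETS.items()}
```

Whether a hash can be combined with a different number than the one it is listed with is untested; the sketch only reproduces the pairs given here.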
Source of cases
https://hub.arcgis.com/datasets/esri-colombia::colombia-covid19-coronavirus-procedencia-de-los-casos/data?selectedAttribute=CASOS
CSV: https://opendata.arcgis.com/datasets/3a505d6969c149f98b122fb0a6fd1e7e_4.csv
Number of confirmed cases by state
https://hub.arcgis.com/datasets/esri-colombia::colombia-covid19-coronavirus-departamento/data
CSV: https://opendata.arcgis.com/datasets/ed48c4ce9ca94d5499f1c327f8f532f1_1.csv
Cases by municipality
https://hub.arcgis.com/datasets/esri-colombia::colombia-covid19-coronavirus-municipio/data
CSV: https://opendata.arcgis.com/datasets/53beb24d21f146c38a42db63c92e3460_0.csv
This is the one we want; includes population, population density, total cases, total active cases, total deaths, and total recovered.
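A rough sketch of loading that CSV with only the standard library. I haven't confirmed the real column names, so the inline sample uses obviously fake ones (`NAME`, `TOTAL`); the `utf-8-sig` decoding is an assumption to cope with a possible byte-order mark, and `fetch_text` and `parse_csv_text` are hypothetical helper names.

```python
import csv
import io
import urllib.request

# CSV link for the municipality dataset, as listed in this issue.
MUNICIPALITY_CSV = (
    "https://opendata.arcgis.com/datasets/53beb24d21f146c38a42db63c92e3460_0.csv"
)


def fetch_text(url, timeout=30):
    """Download a URL and decode it as UTF-8, dropping a BOM if present."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8-sig")


def parse_csv_text(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


# Offline example with placeholder columns; the real headers may differ.
placeholder_csv = "NAME,TOTAL\nExampleville,3\nSampletown,5\n"
records = parse_csv_text(placeholder_csv)

# Against the live source this would be:
#   records = parse_csv_text(fetch_text(MUNICIPALITY_CSV))
```

Keeping the download and the parsing separate makes it possible to test the parser against a saved copy of the file without hitting the network.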
Case details
https://hub.arcgis.com/datasets/esri-colombia::colombia-covid19-coronavirus-detalle-de-los-casos/data
CSV: https://opendata.arcgis.com/datasets/0e14099fac45422896d50bd52292faea_3.csv
Time series
For the country as a whole; includes new/total cases, deaths, and recoveries.
https://hub.arcgis.com/datasets/esri-colombia::colombia-covid19-coronavirus-casos-diarios/data
CSV: https://opendata.arcgis.com/datasets/782122624f364fbdbd7e287b96c4a358_6.csv