Repository for Global.health: a data science initiative to enable rapid sharing of trusted and open public health data to advance the response to infectious diseases.
Global.health List
Global.health's mission is to enable rapid sharing of trusted and open public health data to advance the response to infectious diseases.
This repository contains the servers and scripts that support its data curation efforts.
If you have any questions, please get in touch via: [email protected]
COVID-19 Data
The data exposed on Global.health was curated using two methods. ~60,000 cases were curated manually by people who analyzed sources and entered the data into spreadsheets. This data was ported from the spreadsheets into the Curator Portal as described here. The rest of the data was automatically ingested from sources through a process described here. Each case is marked as VERIFIED if a human has confirmed that its data is valid, or UNVERIFIED if it has not yet been reviewed.
You can tell that a case was imported from the manually created spreadsheets in two ways: it is marked as created by [email protected], and its source URL links to this documentation. The source URL that was originally used to find data about such a case is listed in the additional sources section of the detailed case view (opened by clicking on the table row).
Frontends
COVID-19
Daily exports of case data
A daily export of case data can be downloaded from the data portal. The data is generated using this script, with this data dictionary.
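As a rough illustration of working with a downloaded export, the sketch below tallies cases by verification status. It assumes the export is a CSV file with one row per case. The column name `verification_status` and the function `count_by_status` are hypothetical and not taken from this repository; check the data dictionary for the actual field names.

```python
import csv
import io
from collections import Counter

# Hypothetical column name; the real one is defined in the data dictionary.
STATUS_COLUMN = "verification_status"


def count_by_status(csv_file, status_column=STATUS_COLUMN):
    """Count rows per verification status in an open CSV file object."""
    reader = csv.DictReader(csv_file)
    counts = Counter()
    for row in reader:
        # Rows with a missing or empty status are grouped under "UNKNOWN".
        status = (row.get(status_column) or "").strip().upper() or "UNKNOWN"
        counts[status] += 1
    return counts


# Tiny made-up sample standing in for a downloaded export file.
SAMPLE = io.StringIO(
    "case_id,verification_status\n"
    "1,VERIFIED\n"
    "2,UNVERIFIED\n"
    "3,VERIFIED\n"
)

if __name__ == "__main__":
    print(dict(count_by_status(SAMPLE)))
```

For a real export, open the downloaded file with `open(path, newline="")` and pass the file object to `count_by_status`, adjusting `status_column` to match the data dictionary.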
CI/CD status
- Docker images
- Tests
- Monitoring
Components
- The data service in data-serving/data-service facilitates CRUD operations with the MongoDB database storing case data.
- The curator service in verification/curator-service/api serves as the backend for the curator portal, which enables curators to view, enter, update, and verify cases; manage data sources and their ingestion; and manage portal access.
- The geocoding service geocodes locations and is used by the data service, but can be used standalone as well.
- The curator UI in verification/curator-service/ui is the frontend for the curator portal.
Developer documentation
READMEs
- Getting set up
- Local development
- Production infrastructure set-up and management
- Component documentation
- Authentication & authorization
- Data ingestion functions
- Curator service
- Curator UI
- Data service
- Geocoding
- API
- Load testing
- Scripts
- Data service
- Converting legacy CSV data to schema-conformant JSON
- Converting & importing legacy data into MongoDB
- Exporting MongoDB data to CSV/JSON
- Setting up your MongoDB instance
- Curator portal
- Data service
- How do I...
- Update the case schema
- Rotate secrets
API docs
Source licenses and terms of use
This repository and daily data exports are published under the MIT license.
Each automatically ingested data source must have a license and terms of use attached, which requires curators to look up the terms of every source they set up for ingestion.
If you are the owner of a data source included here and would like us to remove data, add or alter an attribution, or add or alter license information, please open an issue on this repository and we will happily consider your request.