Feature: Add "caveats" for scrapers
Description
In some scrapers, we make justifiable assumptions about how to interpret the data (e.g., covidatlas/coronadatascraper#572 - KOR quarantines). We could hardcode these caveats in each scraper, and perhaps include them in the source output, e.g.:
[
  {
    "county": "Los Angeles County",
    "state": "California",
    "country": "United States",
    ...
    "url": "http://www.publichealth.lacounty.gov/media/Coronavirus/",
    "cases": 0,
    "deaths": 0,
    "caveats": [
      "some_data_here"
    ],
    ...
  }
]
Perhaps these assumptions could be rolled up to the higher levels:
"caveats": [
  "LA, CA: some_data_here",
  "PA: penn. caveats here"
]
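As a rough illustration of the roll-up idea, here is a minimal Python sketch (the project itself is not written in Python, so this is only to show the logic). The function names `location_label` and `roll_up_caveats` are hypothetical, and the label format (joining whichever of county, state, and country are present) is an assumption; the abbreviated labels above, such as "LA, CA", would need their own lookup.

```python
from typing import Any, Dict, List


def location_label(record: Dict[str, Any]) -> str:
    """Build a label from whichever location fields the record has (assumed format)."""
    parts = [record.get(key) for key in ("county", "state", "country")]
    return ", ".join(part for part in parts if part)


def roll_up_caveats(records: List[Dict[str, Any]]) -> List[str]:
    """Collect per-location caveats into one flat, de-duplicated list."""
    rolled: List[str] = []
    for record in records:
        label = location_label(record)
        # Records without a "caveats" field simply contribute nothing.
        for caveat in record.get("caveats", []):
            entry = f"{label}: {caveat}" if label else caveat
            if entry not in rolled:
                rolled.append(entry)
    return rolled


# Hypothetical sample records, modeled on the output shown above.
records = [
    {
        "county": "Los Angeles County",
        "state": "California",
        "country": "United States",
        "caveats": ["some_data_here"],
    },
    {
        "state": "Pennsylvania",
        "country": "United States",
        "caveats": ["penn. caveats here"],
    },
    {"state": "Ohio", "country": "United States"},
]

report = {"caveats": roll_up_caveats(records)}
```

Prefixing each caveat with its location keeps the rolled-up list readable on its own, at the cost of longer strings than the abbreviated form in the example.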
Why do you need this feature or component?
To publicize the assumptions scrapers make when interpreting source data.
Notes
For testing/regression, I don't think we'd need to check the caveats field, as it might change over time. One sanity check would be enough.
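One possible shape for that sanity check, sketched in Python purely for illustration: it validates only the structure of the field (a list of non-empty strings, if present) and never its wording, so caveat text can change without breaking tests. The name `has_valid_caveats` is hypothetical, and treating the field as optional is an assumption.

```python
from typing import Any, Dict


def has_valid_caveats(record: Dict[str, Any]) -> bool:
    """Return True if "caveats" is absent or is a list of non-empty strings."""
    caveats = record.get("caveats", [])
    if not isinstance(caveats, list):
        return False
    return all(isinstance(c, str) and c.strip() != "" for c in caveats)
```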
Yes! This also came up in the discussion of the Panama scraper, because Panama's granularity level is akin to "boroughs" (smaller than cities) and we don't have any way to store that. So if we call them counties, that detail could go in a field like this.