
Feature: Add "caveats" for scrapers

Open • jzohrab opened this issue 4 years ago • 1 comment

Description

In some scrapers, we're making justifiable assumptions about how to interpret the data (e.g., covidatlas/coronadatascraper#572 - KOR quarantines). We could hardcode these caveats in the scrapers themselves, and perhaps include them in the source output, e.g.:

[
  {
    "county": "Los Angeles County",
    "state": "California",
    "country": "United States",
...
    "url": "http://www.publichealth.lacounty.gov/media/Coronavirus/",
    "cases": 0,
    "deaths": 0,
    "caveats": [
      "some_data_here"
    ],
...
  }
]
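A minimal sketch of how a scraper might attach its hardcoded caveats to each record it emits, assuming records are plain dictionaries shaped like the one above. The names `CAVEATS` and `with_caveats` are hypothetical and not part of the project; the caveat text is the placeholder from the example.

```python
# Hypothetical: caveats hardcoded alongside a single scraper.
CAVEATS = ["some_data_here"]


def with_caveats(record, caveats=CAVEATS):
    """Return a copy of the record with a 'caveats' list attached."""
    # Copy the list so later edits to one record do not leak into others.
    return {**record, "caveats": list(caveats)}


record = with_caveats({
    "county": "Los Angeles County",
    "state": "California",
    "country": "United States",
    "cases": 0,
    "deaths": 0,
})
```

Keeping the caveats next to the scraper code means they travel with the assumption they describe.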

Perhaps these assumptions could be rolled up to the higher levels:

    "caveats": [
        "LA, CA: some_data_here",
        "PA: penn. caveats here"
    ]
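One way the roll-up could work, as a rough sketch: walk the per-location records and prefix each caveat with a location label. The function names and the label format (county and state joined by a comma) are assumptions for illustration; the abbreviated labels in the example above would need a separate lookup.

```python
def location_label(record):
    """Build a label from whichever of county/state the record has."""
    parts = [record.get(key) for key in ("county", "state")]
    return ", ".join(part for part in parts if part)


def roll_up_caveats(records):
    """Collect per-record caveats into one de-duplicated, ordered list."""
    rolled = []
    for record in records:
        label = location_label(record)
        for caveat in record.get("caveats", []):
            entry = f"{label}: {caveat}"
            if entry not in rolled:
                rolled.append(entry)
    return rolled


records = [
    {"county": "Los Angeles County", "state": "California",
     "caveats": ["some_data_here"]},
    {"state": "Pennsylvania", "caveats": ["penn. caveats here"]},
    {"state": "Ohio"},  # no caveats: contributes nothing
]
rolled = roll_up_caveats(records)
```

Records without a `caveats` field are simply skipped, so scrapers that make no special assumptions need no changes.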

Why do you need this feature or component?

To publicize the assumptions the scrapers make.

Notes

For testing/regression, I don't think we'd need to check the caveats field, as it might change over time. One sanity check would be enough.
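The single sanity check could be as small as the following sketch, which checks only the shape of the field and never its text. `caveats_look_sane` is a hypothetical helper, and treating a missing field as valid is an assumption.

```python
def caveats_look_sane(record):
    """True if 'caveats' is absent or a list of non-empty strings."""
    caveats = record.get("caveats", [])
    return isinstance(caveats, list) and all(
        isinstance(caveat, str) and caveat.strip() for caveat in caveats
    )
```

Because the wording is never compared, the check keeps passing when a caveat is reworded.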

jzohrab, Apr 07 '20 18:04

Yes! This also came up in the discussion of the Panama scraper, because the Panama granularity level is akin to "boroughs" (smaller than cities) and we don't have any way to store that. So if we call them counties, that detail could go in a field like this.

shaperilio, Apr 07 '20 23:04