warn-scraper
OH missing earliest years from PDF
The Ohio scraper has been rebuilt and most of the archives were consolidated into a single CSV for download.
However, the CSV that Big Local News had been hosting contained badly parsed data from the 2015 and 2016 PDFs, including a lot of junk characters. We could use someone to parse those two PDFs into CSV format so we can add them to our archival data.
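As a rough starting point for the cleanup side of that task, here is a minimal sketch of scrubbing junk characters from already-extracted cell text before writing rows to CSV. It assumes the junk is mostly control characters and Unicode replacement characters, which may not match what is actually in the old CSV. The function names `clean_cell` and `rows_to_csv` and the sample row are hypothetical, not part of warn-scraper.

```python
import csv
import io
import unicodedata


def clean_cell(text):
    """Replace control/format/unassigned characters and U+FFFD with spaces,
    then collapse runs of whitespace. Assumes that is what the junk looks like."""
    kept = "".join(
        " " if ch == "\ufffd" or unicodedata.category(ch).startswith("C") else ch
        for ch in text
    )
    return " ".join(kept.split())


def rows_to_csv(rows):
    """Clean every cell and return the rows as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    for row in rows:
        writer.writerow([clean_cell(cell) for cell in row])
    return buffer.getvalue()


# Hypothetical row standing in for text pulled out of one of the PDFs.
dirty_rows = [["Acme\x00 Corp\ufffd", " \x0c Columbus "]]
print(rows_to_csv(dirty_rows))
```

This only handles cleanup. Getting the tables out of the 2015 and 2016 PDFs in the first place would need a PDF extraction tool, which is not shown here.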
The original PDFs are included in the ZIP, as is the then-consolidated snapshot of the CSV:
https://storage.googleapis.com/bln-data-public/warn-layoffs/oh_2015-2022.zip
The current scraper grabs 2017-2022 from a CSV similar to the one in the ZIP file here, except that the 2015, 2016, and 2023 data have been purged from it.
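To reproduce that purge from the consolidated snapshot in the ZIP, something like the sketch below might work. The column name `received_date`, the `YYYY-MM-DD` date format, and the sample rows are all assumptions for illustration. Check the real CSV header before relying on this.

```python
import csv
import io

# Hypothetical column name; the real header in the ZIP's CSV may differ.
DATE_COLUMN = "received_date"
PURGED_YEARS = {"2015", "2016", "2023"}


def drop_purged_years(rows, date_column=DATE_COLUMN, purged=PURGED_YEARS):
    """Keep rows whose date does not start with a purged year.
    Assumes dates are formatted as YYYY-MM-DD."""
    return [row for row in rows if row[date_column][:4] not in purged]


# Made-up sample data standing in for the consolidated CSV.
sample = io.StringIO(
    "company,received_date\n"
    "Acme,2015-03-01\n"
    "Globex,2017-06-15\n"
    "Initech,2023-01-09\n"
)
kept = drop_purged_years(list(csv.DictReader(sample)))
print([row["company"] for row in kept])
```

Once the 2015 and 2016 PDFs are parsed cleanly, the same year filter could be relaxed to let those rows back in.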