warn-scraper
Add scrapers for all remaining states
For each area without a scraper we should make a ticket and do the following:
- Briefly write up whether multiple sources of data exist, and why one may be preferable to another
- Determine whether the site is part of a broader platform (e.g. #126)
- Propose a scraping strategy
The areas below do not currently have a scraper:
- [ ] Arkansas
- [x] Colorado #65
- [x] Georgia #63
- [x] Hawaii #371
- [x] Idaho #82
- [x] Illinois #81
- [x] Kentucky #80
- [x] Louisiana #79
- [ ] Massachusetts #78
- [x] Michigan #372
- [ ] Minnesota #75
- [ ] Mississippi #373
- [ ] Nevada #237
- [ ] New Hampshire
- [x] New Mexico #73
- [ ] North Carolina #74
- [ ] North Dakota
- [ ] Pennsylvania #374
- [x] South Carolina #69
- [x] Tennessee #192
- [ ] West Virginia #375
- [ ] Wyoming
- [ ] American Samoa
- [ ] Guam
- [ ] Northern Mariana Islands
- [ ] Puerto Rico
- [ ] Virgin Islands
Added issues for:
- HI #371
- MI #372
- MS #373
- PA #374
- WV #375
Found one for:
- NV #237
Ruled out scraping in the remainder of U.S. states based on prior research. Haven't tackled territories.
Thanks @chriszs - I've added the tickets you created/dug up to the main body of this issue. Based on your prior work tackling WARN, should we flag the remaining states (Arkansas, New Hampshire, North Dakota and Wyoming) as not offering data, or as non-scrapable (i.e. they have no data on the web but we could get it through a public records request)?
Arkansas and Wyoming weren't obtainable via FOIA. North Dakota was. New Hampshire unclear. But worth revisiting just to be sure.
A couple additional ways to think about completeness:
- Population - E.g. notices in states accounting for 9X% of the population. This is, I believe, how we handled only having ~47 states in a story. This is why I started #425, #427 and reopened #358, since that's the next largest state with no coverage. By that metric, #430 and #374 come next.
- Years - Looks like you can get historical data back to 2015 in most states without too much trouble. So, I've opened #431, #432, #433 and added a comment to #72.
The combination gives you seven years of data over states with the vast majority of people, which is pretty good and is only going to get better.
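The population metric above can be sketched roughly as follows. This is only an illustration: the function names `population_coverage` and `largest_uncovered` are hypothetical and not part of warn-scraper, and the population figures are placeholders rather than real census numbers.

```python
def population_coverage(populations, covered):
    """Return the fraction of the total population living in covered areas."""
    total = sum(populations.values())
    if total <= 0:
        raise ValueError("populations must sum to a positive number")
    covered_total = sum(
        pop for area, pop in populations.items() if area in covered
    )
    return covered_total / total


def largest_uncovered(populations, covered, n=3):
    """Return the n most populous areas without coverage, largest first."""
    missing = [
        (area, pop) for area, pop in populations.items() if area not in covered
    ]
    missing.sort(key=lambda item: item[1], reverse=True)
    return [area for area, _ in missing[:n]]


# Placeholder figures for illustration only, NOT real population data.
example_populations = {"A": 50, "B": 30, "C": 15, "D": 5}
example_covered = {"A", "C"}  # areas that already have a scraper

print(population_coverage(example_populations, example_covered))  # 0.65
print(largest_uncovered(example_populations, example_covered))  # ['B', 'D']
```

Feeding in real state populations and the list of states with working scrapers would give both the headline coverage share and the order in which to tackle the gaps.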
That is tempered somewhat by Bloomberg's look at completeness compared to jobless claim stats, which found that WARN (and the related state laws that seem to account for some of the notices) only shows part of the overall picture in years with lots of layoffs (5% in a couple of key states). GAO found a similar thing in an audit featured in the piece. Important caveat, but probably out of our control.