Eric Mill

Results 227 issues of Eric Mill

We have phone, website, and office/address for House committees, but not Senate committees. The website, at least, seems present on Senate.gov. Example: http://www.senate.gov/general/committee_membership/committee_memberships_SSBK.htm

The channel IDs don't obey the contract of being able to say `youtube.com/[ID]` to construct a valid URL. They only work at `youtube.com/channel/[ID]`. I don't know if YouTube worked this...
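
A minimal sketch of how a consumer could build the right URL for each kind of identifier. The field names `youtube` (a username) and `youtube_id` (a channel ID) are assumptions for illustration, not confirmed names from the dataset.

```python
def youtube_url(youtube=None, youtube_id=None):
    """Build a YouTube URL from a username or a channel ID.

    Both parameter names are hypothetical. Channel IDs only resolve
    under /channel/, while usernames resolve at the site root.
    """
    if youtube_id:
        return f"https://www.youtube.com/channel/{youtube_id}"
    if youtube:
        return f"https://www.youtube.com/{youtube}"
    return None  # neither identifier is present
```

Preferring the channel ID when both are present is a design choice of this sketch, not something the issue specifies.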

The Senate Periodical Press Gallery has revamped itself, and now offers a bunch of new resources: http://www.periodicalpress.senate.gov/ - What they've always offered: [to-the-minute Senate floor updates](http://www.periodicalpress.senate.gov/), though without any timestamps....

The House's History site has a page dedicated to them: http://history.house.gov/Records-and-Research/FAQs/Committee-Names/ Thanks to @danielschuman for pointing this out.

As reported by anyone integrating the data. The logic from an integrator should be - if it's closed, re-open it. If it's open, do nothing (I imagine GitHub will silently...
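
The re-open rule could be sketched roughly as below. The function name `plan_issue_update` is hypothetical; the returned dictionary is the request body GitHub's REST API accepts when updating an issue (`PATCH /repos/{owner}/{repo}/issues/{issue_number}`). Fetching the issue and sending the request are left out.

```python
def plan_issue_update(current_state):
    """Decide what an integrator should send for an existing issue.

    Returns the PATCH body that re-opens a closed issue, or None when
    the issue is already open and nothing needs to be sent.
    """
    if current_state == "closed":
        return {"state": "open"}
    if current_state == "open":
        return None
    # Any other value is unexpected, so fail loudly instead of guessing.
    raise ValueError(f"unexpected issue state: {current_state!r}")
```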

This seems handy: https://github.com/fouber/page-monitor

Though there's no bulk data and it doesn't cover every IG, oversight.gov at least has consistent HTML for many IGs. Though, we also may be interested in identifying differences or...

Even if the USPS or DHS IGs don't have them, at least set up a process where if it does detect any, it emails the admin.
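
One way to sketch the "email the admin if anything is detected" step, using only the standard library. The addresses, the `build_alert` name, and the shape of `findings` (a list of strings) are assumptions for illustration. Sending is left to the caller, for example with `smtplib.SMTP.send_message`.

```python
from email.message import EmailMessage

ADMIN_EMAIL = "admin@example.com"     # hypothetical address
SENDER_EMAIL = "scraper@example.com"  # hypothetical address


def build_alert(inspector, findings):
    """Return an alert email when anything was detected, else None."""
    if not findings:
        return None  # nothing detected, so no email is needed
    msg = EmailMessage()
    msg["To"] = ADMIN_EMAIL
    msg["From"] = SENDER_EMAIL
    msg["Subject"] = f"{inspector}: {len(findings)} item(s) detected"
    msg.set_content("\n".join(findings))
    return msg
```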

Data Improvement

I'm not sure of the best path for detecting reports that need OCRing (perhaps through a flag set by the scraper), but we should have `tesseract` for OCRing of some...
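
A rough sketch of the flag-based approach, assuming the scraper sets a hypothetical `needs_ocr` key on each report. It only builds the `tesseract` command line (`tesseract <image> <output base>`, which writes `<output base>.txt`) and does not run it. Converting a PDF report to page images first is outside this sketch.

```python
def ocr_command(report, image_path, output_base):
    """Return the tesseract argv for a flagged report, else None.

    `needs_ocr` is a hypothetical flag name set by the scraper.
    """
    if not report.get("needs_ocr"):
        return None
    # tesseract appends ".txt" to output_base for plain-text output.
    return ["tesseract", image_path, output_base]
```

The returned list can be passed to `subprocess.run(cmd, check=True)` on a machine where `tesseract` is installed.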

Not the GAO IG, but the [GAO](http://www.gao.gov/) itself, which publishes an amazing number of excellent reports. There are four interesting datasets, with two known existing scrapers: - Reports, for which...