scrapers

Code relating to scraping public police data.

Results 36 scrapers issues

Let's look at the scrapers we have and the fields they scrape, to see what we can learn from them. Each scraper has a fields.txt.
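One way to start that survey is to tally which field names recur across scrapers. The sketch below is only illustrative: it assumes each `fields.txt` holds one field name per line, which this listing does not confirm, and `tally_fields` is a hypothetical helper name, not something in the repository.

```python
from collections import Counter
from pathlib import Path


def tally_fields(root):
    """Count how many scrapers list each field name in their fields.txt."""
    counts = Counter()
    for fields_file in Path(root).rglob("fields.txt"):
        text = fields_file.read_text(encoding="utf-8")
        # Assumption: one field name per line; blank lines are skipped.
        # Names are lowercased so "Date" and "date" count as the same field.
        names = {line.strip().lower() for line in text.splitlines() if line.strip()}
        counts.update(names)
    return counts
```

Calling `tally_fields("scrapers_library").most_common(10)` from a checkout would then list the ten most widely shared field names, if the files follow the assumed layout.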

good first issue

## The task: - [ ] Represent these requirements in the scrapers readme or template as appropriate - [ ] Represent them by creating an example scraper that meets the...

> If we could have a subpage to test the scrapers on, that'd be great. Basically two separate pages, both having a pdf with the same name, but different data.
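The test-page idea above targets a specific failure: a scraper that decides a file is unchanged because its name is unchanged. A minimal sketch of checking content instead of the file name, using a SHA-256 digest of the downloaded bytes, could look like the following. The function names are hypothetical and not taken from the repository.

```python
import hashlib


def content_digest(data: bytes) -> str:
    """Return a SHA-256 hex digest of the raw file bytes."""
    return hashlib.sha256(data).hexdigest()


def has_changed(previous_digest, data: bytes) -> bool:
    """True when the bytes differ from the previously stored digest.

    previous_digest may be None when the file has never been seen before.
    """
    return previous_digest != content_digest(data)
```

Two test pages serving a PDF with the same name but different data should produce different digests, so `has_changed` would report the second download as new.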

### Context https://github.com/openpolicedata/openpolicedata OpenPoliceData is a Python library which serves as an access point to hundreds of valuable datasets. We've added them to our database before, but it's been a...

lists of data sources

### Context existing case search: https://ujsportal.pacourts.us/CaseSearch endpoint: `only sharing with engaged volunteers` > The overall goal is to create infrastructure to help people answer questions using this complex data source...

good first issue
python
scraper-request

### Context Related to data source request `102` Pennsylvania publishes municipal, county, and state budgets. It's possible to find individual municipal budgets, which include police budgets, but cumbersome to get...

help wanted
scraper-request

### Context A researcher/journalist made this data request: > We are looking to scrape records posted by California agencies under SB 1421 and SB 16, including pdfs, audio, video and...

scraper-request

These are some of the most common data portals. We should have a scraper template or utility for each one, likely slightly different in each case, which helps people collect...

good first issue

### Context The [CityProtect scraper](https://github.com/Police-Data-Accessibility-Project/scrapers/blob/main/scrapers_library/data_portals/cityprotect/Cityprotect_base_scraper.py) could use a rework. It works well enough at what it does, but it is outdated, and the instructions in the README are confusing. ### Requirements...