Chicago Pedestrian Stops data cannot be loaded
When the Chicago Pedestrian Stops data is loaded, a 403 Forbidden error is returned. The solutions at the links below have already been tried, and the robots.txt file for the website appears to be empty.
https://stackoverflow.com/questions/62278538/pd-read-csv-produces-httperror-http-error-403-forbidden
https://stackoverflow.com/questions/54540901/why-am-i-getting-a-http-403-error-with-pandas
This seems to work:
import pandas as pd
url = "https://home.chicagopolice.org/wp-content/uploads/2022-ISR.zip"
storage_options = {'User-Agent': 'Mozilla/5.0'}  # sent as an HTTP request header
df = pd.read_csv(url, storage_options=storage_options)
UPDATE: I tried this again on my work computer and it no longer works.
This seems to work:
url = "https://home.chicagopolice.org/wp-content/uploads/2022-ISR.zip"
storage_options = {'User-Agent': 'Mozilla/5.0'}
df = pd.read_csv(url, storage_options=storage_options)
This does not work on my home computer
https://home.chicagopolice.org/robots.txt returns a blank page. I assume this means that nothing is disallowed.
This might be worth testing: https://stackoverflow.com/questions/16627227/problem-http-error-403-in-python-3-web-scraping
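The approach in that link could be sketched roughly as follows: download the file with urllib using a browser-like User-Agent header, then hand the bytes to pandas. This assumes the 403 is caused by the default Python User-Agent, which has not been confirmed for this site. build_request and read_zipped_csv are hypothetical helper names, not part of pandas or this project.

```python
import io
import urllib.request

import pandas as pd

URL = "https://home.chicagopolice.org/wp-content/uploads/2022-ISR.zip"


def build_request(url, user_agent="Mozilla/5.0"):
    """Build a request that sends a browser-like User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def read_zipped_csv(url):
    """Download a ZIP archive holding a single CSV and parse it with pandas."""
    with urllib.request.urlopen(build_request(url)) as response:
        payload = response.read()
    # A BytesIO has no file name, so the compression must be given explicitly.
    return pd.read_csv(io.BytesIO(payload), compression="zip")
```

Calling read_zipped_csv(URL) would then replace the pd.read_csv(url, storage_options=...) call. If the server still answers 403, the block may depend on something other than the User-Agent header, and this sketch will not help.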
All data has been successfully added except 2019 data.
The 2019 ZIP archive contains multiple CSV files, which pd.read_csv cannot read directly:
> url = 'https://home.chicagopolice.org/wp-content/uploads/2019-ISR.zip'
> storage_options = {'User-Agent': 'Mozilla/5.0'}
> table = pd.read_csv(url, encoding_errors='surrogateescape', storage_options=storage_options)
Exception has occurred: ValueError
Multiple files found in ZIP file. Only one file per ZIP: ['2019-ISR-Jan-Jun.csv', '2019-ISR-Jul-Dec.csv']
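One possible workaround, as a minimal sketch: open the archive with the standard zipfile module, read each CSV member separately, and concatenate the results. This assumes both 2019 files share the same columns, which has not been verified here. read_all_csvs_from_zip is a hypothetical helper name. It takes the archive as bytes, so the download (with whatever headers turn out to be needed) is left to the caller.

```python
import io
import zipfile

import pandas as pd


def read_all_csvs_from_zip(payload, **read_csv_kwargs):
    """Read every CSV member of a ZIP archive (given as bytes) into one DataFrame."""
    frames = []
    with zipfile.ZipFile(io.BytesIO(payload)) as archive:
        for name in archive.namelist():
            if not name.lower().endswith(".csv"):
                continue  # skip directories and non-CSV members
            with archive.open(name) as member:
                frames.append(pd.read_csv(member, **read_csv_kwargs))
    if not frames:
        raise ValueError("No CSV files found in ZIP archive")
    # Assumption: all members have matching columns, so rows can be stacked.
    return pd.concat(frames, ignore_index=True)
```

Extra keyword arguments are passed through to pd.read_csv, so a call such as read_all_csvs_from_zip(payload, encoding_errors='surrogateescape') would keep the option used in the failing snippet.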