warn-scraper
Add MN scraper
https://mn.gov/deed/programs-services/dislocated-worker/reports/
PDFs
When I try to scrape using utils.get_urls, requests.get(url), or requests.get(url, verify=False), the website demands that I prove I'm not a bot: the response body contains the challenge page it displays instead of the report listing. The other error I get is "max retries exceeded ... unable to get local issuer certificate".
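Two separate things may be going on here, and a sketch of a workaround for each might help. The "unable to get local issuer certificate" error usually means Python can't validate the site's certificate chain against its local CA store, which pointing `requests` at certifi's bundle can fix without disabling verification. The bot check is sometimes triggered simply by the default `python-requests` User-Agent, so presenting browser-like headers occasionally gets past it (no guarantee — a real challenge page like Cloudflare's won't be satisfied by headers alone). The helper name and header values below are illustrative, not part of warn-scraper's API:

```python
import certifi
import requests

# Browser-like headers; the default "python-requests/x.y" UA is an easy
# signal for bot detection, so we present something more ordinary.
BROWSER_HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
        "AppleWebKit/537.36 (KHTML, like Gecko) "
        "Chrome/120.0 Safari/537.36"
    ),
    "Accept": "text/html,application/xhtml+xml",
}


def fetch(url: str) -> requests.Response:
    """Hypothetical fetch helper: browser-like headers plus an explicit
    CA bundle, instead of turning verification off with verify=False."""
    session = requests.Session()
    session.headers.update(BROWSER_HEADERS)
    # Use certifi's up-to-date CA bundle to resolve the
    # "unable to get local issuer certificate" failure.
    session.verify = certifi.where()
    resp = session.get(url, timeout=30)
    resp.raise_for_status()
    return resp
```

If the challenge page still comes back after this, the block is likely JavaScript-based and a plain HTTP client won't pass it.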
The URL of each PDF ends with what looks like a random number, so constructing the individual PDF URLs directly seems impossible. Is there a way to bypass this?
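Since the numbers can't be predicted, the usual approach is not to construct the URLs at all but to harvest them from the reports page itself: fetch the listing HTML once, collect every anchor whose href ends in ".pdf", and resolve each against the page URL. A minimal stdlib-only sketch (the class name and the sample markup are made up for illustration; the real page's markup may differ):

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class PDFLinkParser(HTMLParser):
    """Collect absolute URLs of <a href="...pdf"> links in a page."""

    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.pdf_urls: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        if href.lower().endswith(".pdf"):
            # Resolve relative links against the listing page's URL.
            self.pdf_urls.append(urljoin(self.base_url, href))


# Stand-in markup; in practice this would be the fetched listing page.
sample_html = '<a href="/deed/assets/report-83471.pdf">Q1 report</a>'
parser = PDFLinkParser(
    "https://mn.gov/deed/programs-services/dislocated-worker/reports/"
)
parser.feed(sample_html)
# parser.pdf_urls now holds every absolute PDF link found in the markup
```

The same idea works with BeautifulSoup if it's already a project dependency; the point is that the "random" numbers never need to be guessed, only read off the index page.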