Avi Eisenberg

11 comments by Avi Eisenberg

I could use an export of case numbers and precedential status from calctapp for the backscraping. From https://www.courtlistener.com/?stat_Non-Precedential=on&order_by=dateFiled+desc&court=calctapp&type=o&page=2, CL currently has 91,367 precedential cases and 26,335 non-precedential cases. That would...

Actually, the minutes PDFs have the case names in them. But to get them you have to parse the PDF and it's combined with other information like judge name -...

Do you think the case names from the minute PDFs are good enough?

Can you try this: `curl 'https://appellatepublic.kycourts.net/api/api/v1/cases/search?queryString=true&searchFields%5B0%5D.searchType=Starts%20With&searchFields%5B0%5D.operation=%3D&searchFields%5B0%5D.values%5B0%5D=2019-CA-0208&searchFields%5B0%5D.indexFieldName=caseNumber'`
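For readability, here is a rough Python sketch of how the query string in that `curl` command decomposes. The endpoint and parameter names are copied from the command itself; the helper name `build_search_url` is made up for illustration, and nothing here is a documented contract of the API.

```python
from urllib.parse import quote, urlencode

# Endpoint copied verbatim from the curl command above.
BASE_URL = "https://appellatepublic.kycourts.net/api/api/v1/cases/search"


def build_search_url(case_number_prefix):
    """Build the search URL for case numbers starting with the given prefix.

    Hypothetical helper: it only reproduces the parameters seen in the curl
    command and does not send a request.
    """
    params = {
        "queryString": "true",
        "searchFields[0].searchType": "Starts With",
        "searchFields[0].operation": "=",
        "searchFields[0].values[0]": case_number_prefix,
        "searchFields[0].indexFieldName": "caseNumber",
    }
    # quote_via=quote encodes spaces as %20 (the default, quote_plus, uses '+').
    return f"{BASE_URL}?{urlencode(params, quote_via=quote)}"


url = build_search_url("2019-CA-0208")
print(url)
```

The printed URL should match the one in the `curl` command character for character, which makes it easy to swap in a different case number prefix.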

Also, what's the general philosophy on blocked scrapers and using work-arounds? E.g., this site works fine over the Private Internet Access VPN for me; if it is blocked on production, could you just...

This site should cover both the Court of Appeals and the Supreme Court for KY.

I mean, for something like KY where you'd be pinging them a handful of times a day at most, I don't think it would be unmanageable. I'm curious how hard...

It's trivial to add a proxy to a `requests` call. Looks like you could just add a `PROXY_URL` and `PROXY_PASSWORD` to the settings file, and add a couple of lines...
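A minimal sketch of what that could look like, assuming the proxy takes basic credentials embedded in its URL. `PROXY_URL` and `PROXY_PASSWORD` are the setting names suggested above; `PROXY_USER`, the example values, and the helpers `build_proxies` and `fetch` are assumptions added for illustration, not existing project code.

```python
from urllib.parse import quote, urlsplit, urlunsplit

# Hypothetical settings values. PROXY_USER is an added assumption.
PROXY_URL = "http://proxy.example.com:8080"
PROXY_USER = "scraper"
PROXY_PASSWORD = "s3cret/pw"


def build_proxies(proxy_url, user, password):
    """Return a requests-style proxies dict with credentials in the URL."""
    parts = urlsplit(proxy_url)
    # Percent-encode credentials so characters like '/' or '@' stay valid.
    credentials = f"{quote(user, safe='')}:{quote(password, safe='')}"
    netloc = f"{credentials}@{parts.netloc}"
    authed = urlunsplit(
        (parts.scheme, netloc, parts.path, parts.query, parts.fragment)
    )
    # Route both plain HTTP and HTTPS traffic through the same proxy.
    return {"http": authed, "https": authed}


def fetch(url, proxies=None, timeout=30):
    """GET a URL, optionally through a proxy."""
    # Imported here so build_proxies can be used without requests installed.
    import requests

    response = requests.get(url, proxies=proxies, timeout=timeout)
    response.raise_for_status()
    return response


proxies = build_proxies(PROXY_URL, PROXY_USER, PROXY_PASSWORD)
# A scraper would then call, e.g.: fetch(some_court_url, proxies=proxies)
```

Passing `proxies=None` leaves `requests` on its default behavior, so the same `fetch` call would work for courts that do not need the work-around.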

https://kcoj.kycourts.net/Content/docs/TermsOfUse2-4-13.pdf

> The Kentucky Court of Justice does not allow any spiders, data mining, or data scraping of this Site. Use of software intended to discover and extract data from this...

Well, the scraper needs to be rewritten to use the new URLs, and maybe we should scale it up slowly to make sure it doesn't trigger anything again. Then we...