
Fill `ky` gap

Open grossir opened this issue 1 year ago • 3 comments

Part of #929

Between December 28th, 2017 and May 16th, 2020, there are 6 documents in CL. According to the source, there should be 626.

grossir avatar Mar 27 '24 18:03 grossir

The current scraper only yields the most recent opinions, and there is no way to filter by date on that primary endpoint.

However, there is an Opinion Search portal on the same domain as the secondary endpoint.

On the backend, it calls the opinions API. Comparing this portal with the primary endpoint, it seems to have the same data, and it allows querying by date. As in the current scraper, another request will be needed to get the case name.

I think it will be necessary (and a good opportunity for refactoring) to change the whole scraper to use the opinions API behind the Opinion Search portal.
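Since the opinions API allows querying by date, a backscraper could walk the gap in fixed-size date windows and issue one query per window. The following is only a rough sketch of that windowing step: the `date_windows` helper and the 30-day window size are assumptions made for illustration, not part of juriscraper or of the Kentucky API, and the actual request parameters are left out because they are not described here.

```python
from datetime import date, timedelta
from typing import Iterator, Tuple


def date_windows(
    start: date, end: date, days: int = 30
) -> Iterator[Tuple[date, date]]:
    """Yield consecutive, inclusive (window_start, window_end) pairs.

    Hypothetical helper: the windows cover start..end with no overlap,
    and the last window is truncated so it never runs past `end`.
    """
    if days < 1:
        raise ValueError("days must be at least 1")
    current = start
    while current <= end:
        window_end = min(current + timedelta(days=days - 1), end)
        yield current, window_end
        # The next window starts the day after this one ends
        current = window_end + timedelta(days=1)


# Date range of the gap reported in this issue
windows = list(date_windows(date(2017, 12, 28), date(2020, 5, 16)))
```

Each `(window_start, window_end)` pair would then be turned into one date-filtered query against the opinions API, followed by the extra request for the case name.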

grossir avatar Mar 28 '24 15:03 grossir

Commands to fill the gaps:

docker exec -it cl-django python /opt/courtlistener/manage.py cl_back_scrape_opinions --courts juriscraper.opinions.united_states.state.ky --backscrape-start=12/27/2017 --backscrape-end=05/17/2020 --verbosity 3

grossir avatar May 08 '24 15:05 grossir

New gap from broken scraper https://github.com/freelawproject/juriscraper/issues/1151

Filed After: 2024-06-14 › Filed Before: 2024-08-21: 0 opinions in CL

manage.py cl_back_scrape_opinions --courts juriscraper.opinions.united_states.state.ky --backscrape-start=2024/06/13 --backscrape-end=2024/08/22 --verbosity 3

grossir avatar Sep 04 '24 02:09 grossir