Rossi

Results 280 comments of Rossi

I have been trying this [implementation](https://github.com/python-jsonschema/jsonschema) [(docs)](https://python-jsonschema.readthedocs.io/en/stable/), which seems like a healthy project. There is a small sample schema for the scrapers [here](https://github.com/grossir/juriscraper/blob/f95bdfc0fc0f190baa1a245fed6fe3b24c4bbe69/juriscraper/schemas/scraper_schema.py):

```
validation_schema = { "type": "object", "properties":...
```
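
For context, validating a scraped record with this library could look roughly like the sketch below. The schema and its field names (`example_schema`, `case_name`, `case_date`, `download_url`) and the helper `validation_errors` are hypothetical stand-ins, not the contents of the linked `scraper_schema.py`; only `Draft7Validator` and `iter_errors` come from `jsonschema` itself.

```python
# Hypothetical schema, for illustration only; the real one is in the linked file.
example_schema = {
    "type": "object",
    "properties": {
        "case_name": {"type": "string", "minLength": 1},
        "case_date": {"type": "string", "pattern": r"^\d{4}-\d{2}-\d{2}$"},
        "download_url": {"type": "string"},
    },
    "required": ["case_name", "case_date", "download_url"],
}


def validation_errors(record: dict) -> list[str]:
    """Return one message per schema violation (empty list if the record is valid)."""
    # Imported lazily so this module still loads where jsonschema is not installed.
    from jsonschema import Draft7Validator

    validator = Draft7Validator(example_schema)
    # iter_errors reports every violation instead of stopping at the first one.
    return [error.message for error in validator.iter_errors(record)]
```

Collecting all errors, rather than raising on the first, seems more useful for scrapers, since a single malformed record can then be logged with every problem it has.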

This gap can also be seen on [CL search](https://www.courtlistener.com/?q=court_id%3Acalctapp&type=o&order_by=dateFiled%20desc&stat_Unpublished=on&filed_after=10%2F08%2F2016&filed_before=08%2F16%2F2020), with start date 10-08-2016 and end date 08-16-2020.
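
To check the same kind of gap for another court or date range, the search link can be rebuilt from its parts. A minimal sketch, assuming the query parameters keep the shape seen in the URL quoted above; the function name `cl_opinion_search_url` is hypothetical.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://www.courtlistener.com/"


def cl_opinion_search_url(court_id, filed_after, filed_before, status_flag):
    """Build a CL opinion search URL for one court and one date range.

    Dates are MM/DD/YYYY strings and `status_flag` is a query key such as
    "stat_Unpublished", both as they appear in the URL quoted above.
    """
    params = {
        "q": f"court_id:{court_id}",
        "type": "o",
        "order_by": "dateFiled desc",
        status_flag: "on",
        "filed_after": filed_after,
        "filed_before": filed_before,
    }
    # quote_via=quote encodes spaces as %20 (not "+"), matching the link above.
    return f"{BASE_URL}?{urlencode(params, quote_via=quote)}"


print(cl_opinion_search_url("calctapp", "10/08/2016", "08/16/2020", "stat_Unpublished"))
```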

Yes, both `colo` and `coloctapp` are using the new site https://research.coloradojudicial.gov. Check #1062.

The backscraper PR also allows filling gaps for `nyappterm`, since the same class is used. Between June 15th, 2020 and February 2nd, 2023 we have 5 documents in [CL](https://www.courtlistener.com/?q=court_id%3Anyappterm&type=o&order_by=dateFiled%20desc&stat_Precedential=on&filed_after=06%2F15%2F2020&filed_before=02%2F16%2F2023). From...

Commands to fill the gaps:

```
docker exec -it cl-django python /opt/courtlistener/manage.py cl_back_scrape_opinions --courts juriscraper.opinions.united_states_backscrapers.state.ny --backscrape --backscrape-start=04/26/2018 --backscrape-end=02/12/2019
```

```
docker exec -it cl-django python /opt/courtlistener/manage.py cl_back_scrape_opinions --courts juriscraper.opinions.united_states_backscrapers.state.ny --backscrape...
```
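
Since each gap needs the same command with different dates, the invocations could be generated from a list of date ranges. This is only a sketch: `backscrape_command` and `gaps` are hypothetical names, the flags are copied from the first command above, and only the first date range is taken from it.

```python
import shlex

BACKSCRAPER = "juriscraper.opinions.united_states_backscrapers.state.ny"


def backscrape_command(court_module, start, end):
    """Return the shell command that backscrapes one MM/DD/YYYY date range."""
    argv = [
        "docker", "exec", "-it", "cl-django",
        "python", "/opt/courtlistener/manage.py", "cl_back_scrape_opinions",
        "--courts", court_module,
        "--backscrape",
        f"--backscrape-start={start}",
        f"--backscrape-end={end}",
    ]
    # shlex.join quotes arguments only when the shell would need it.
    return shlex.join(argv)


# One (start, end) tuple per gap; further ranges would be appended here.
gaps = [("04/26/2018", "02/12/2019")]

for start, end in gaps:
    print(backscrape_command(BACKSCRAPER, start, end))
```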

Yes, let's archive it. We have current data on [CL](https://www.courtlistener.com/?q=court_id%3Alactapp&type=o&order_by=dateFiled%20desc&stat_Published=on&stat_Unknown=on). I wasn't able to reproduce the error; it seems to be caused by random connection errors from the server.

I took some time writing this, without complete testing, but it runs and works as a concept. The code can be seen here: https://github.com/freelawproject/juriscraper/compare/main...grossir:juriscraper:new_opinion_site_subclass?expand=1. Basically, it is a new class...

I think `OpinionSiteLinear` sites can be easily updated to this new base class, since it follows some of the same usage conventions (using `self.sites` to store records; conversion of...

@ERosendo This looks great on the Juriscraper side. I would suggest giving the example files a JSON extension, and formatting them properly so they are readable. To understand this PR...

I revisited this issue while backscraping citations, and I think that we can split this problem into 2 parts: 1) data model changes; 2) row merging / deduplication problem...