Rossi
In some cases, we have the opinion (cluster) but we do not have the related citation. I do not know how many neutral reporters are affected, but I have a...
I have identified 2 types of dirty citation data:

1. duplicated citations that match duplicated opinions
2. corrupt citations: the same citations for hundreds of different opinions

### Duplicated citations...
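As a rough sketch of how the two types could be told apart, the snippet below groups opinion ids by citation and splits the shared citations by how many opinions they point to. The function name, the `(citation, opinion_id)` input shape, and the `corrupt_threshold` cutoff are all assumptions for illustration, not the actual CourtListener schema or the query used in this issue.

```python
def classify_shared_citations(pairs, corrupt_threshold=100):
    """Split shared citations into likely duplicates and likely corrupt ones.

    pairs: iterable of (citation, opinion_id) tuples (hypothetical shape).
    corrupt_threshold: assumed cutoff; a citation attached to at least this
    many distinct opinions is treated as corrupt rather than duplicated.
    """
    opinions_by_citation = {}
    for citation, opinion_id in pairs:
        opinions_by_citation.setdefault(citation, set()).add(opinion_id)

    duplicated, corrupt = [], []
    for citation, opinion_ids in opinions_by_citation.items():
        if len(opinion_ids) >= corrupt_threshold:
            corrupt.append(citation)
        elif len(opinion_ids) > 1:
            duplicated.append(citation)
    return sorted(duplicated), sorted(corrupt)
```

Citations in the first list are candidates for merging the duplicated opinions; citations in the second list would need to be re-extracted from the source documents.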
Let's track here our ideas and requirements for a "scraper status page". Sometimes scrapers fail (for many reasons) and such a page would be useful to notice more errors that...
The [source](https://www.mdcourts.gov/cgi-bin/indexlist.pl?court=coa&year=2023&order=bydate&submit=Submit) lists judge names. However, that same cell often contains strings like "Order", "PC Order", "Per Curiam". We ingested those into CourtListener as recently as March 2024, as early...
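A minimal sketch of how such cell values could be filtered before ingestion. Only the three marker strings come from the issue text; the helper name and the choice to return an empty string are assumptions, and the real scraper may need a longer list.

```python
import re

# Markers quoted in the issue; the full set seen on the source is unknown.
NON_JUDGE_MARKERS = {"order", "pc order", "per curiam"}


def clean_judge_cell(raw):
    """Return the judge name from a table cell, or "" for non-judge markers."""
    # Collapse runs of whitespace so "PC  Order" and "PC Order" compare equal.
    normalized = re.sub(r"\s+", " ", raw).strip()
    if normalized.lower() in NON_JUDGE_MARKERS:
        return ""
    return normalized
```

Values already ingested could be cleaned with the same check, run against the stored judge strings.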
Solves #1019. Updating the base class to OpinionSiteLinear makes the code shorter and cleaner, and also solves the following bug: an InsanityException caused by an unexpected extra line of text in the case name cell: "LEAVE...
Related to #929. Between September 06, 2019 and December 18, 2019 we have [0 documents](https://www.courtlistener.com/?q=court_id%3Afla&type=o&order_by=dateFiled%20asc&stat_Precedential=on&stat_Non-Precedential=on&filed_after=09%2F06%2F2019&filed_before=12%2F18%2F2019). We are missing 35 [documents](https://supremecourt.flcourts.gov/search?enddate=12/18/2019&limit=50&searchtype=opinions&startdate=09/06/2019). To solve this, a dynamic backscraper will be implemented. Also...
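One way a dynamic backscraper could cover a gap like this is to split it into date windows and request each one separately. The sketch below only shows the window generation; the function name and the 30-day default are assumptions, and it does not use or reproduce any juriscraper API.

```python
from datetime import date, timedelta


def date_windows(start, end, days=30):
    """Split [start, end] into consecutive, non-overlapping inclusive windows."""
    windows = []
    current = start
    while current <= end:
        # Each window spans `days` calendar days, clipped to the overall end.
        window_end = min(current + timedelta(days=days - 1), end)
        windows.append((current, window_end))
        current = window_end + timedelta(days=1)
    return windows


# The gap described above, split into 30-day windows.
for window_start, window_end in date_windows(date(2019, 9, 6), date(2019, 12, 18)):
    print(window_start.strftime("%m/%d/%Y"), window_end.strftime("%m/%d/%Y"))
```

Each pair can then be substituted into the `startdate` and `enddate` query parameters of the search URL linked above.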
The current scraper uses this [endpoint](http://media.ca1.uscourts.gov/cgi-bin/opinions.pl/?FROMDATE=01%2F24%2F2024&TODATE=05%2F01%2F2024&puid=), which shows an opinion published on February 2nd, 2024 as the most recent one. This is also the newest opinion from `ca1` that we have...
Solves #1017

Changes:

- Target new endpoint
- Refactor to OpinionSiteLinear
- Implement dynamic backscraper to solve most recent gap
Part of #929

### sd

We have a [gap from](https://www.courtlistener.com/?q=court_id%3Asd&type=o&order_by=dateFiled%20desc&stat_Precedential=on&filed_after=08%2F31%2F2019&filed_before=01%2F03%2F2023) August 30th, 2019 to January 4th, 2023, which should amount to a little over 240 missing documents (totals are not...
Part of #929 Between June 05, 2020 and March 31st, 2022, we have [4 documents](https://www.courtlistener.com/?q=court_id%3Avtsuperct&type=o&order_by=dateFiled%20asc&stat_Precedential=on&stat_Non-Precedential=on&filed_after=06%2F06%2F2020&filed_before=03%2F31%2F2022). We are missing ~175 [documents](https://www.vermontjudiciary.org/opinions-decisions?facet_from_date=06/05/2020&facet_to_date=03/31/2022&f%5B0%5D=document_type%3A94) from civil, criminal, family and environmental courts. Between November 16th,...