Fill `afcca` gap
Part of #929
Between February 1st, 2021 and February 1st, 2023, we have 0 documents in CL. We are missing more than 250 documents from 2021 and 2022 (198).
Command to fill the gap:
docker exec -it cl-django python /opt/courtlistener/manage.py cl_back_scrape_opinions --courts juriscraper.opinions.united_states.federal_special.afcca --backscrape-start=2021 --backscrape-end=2023
After Ramiro ran the backscrape, we have a single missing document for 2021, the one from 7 Apr 2021. I got the hash manually and, again, the hash already existed in CourtListener: the document was a duplicate.
No missing docs in 2022
For 2023 I got the following mismatches:
Error in 2023-06-27: 2 in site v 3 in db
Error in 2023-04-10: 1 in site v 2 in db
Error in 2023-02-03: 4 in site v 2 in db
Error in 2023-01-26: 3 in site v 2 in db
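For reference, a per-date comparison like the one above can be sketched as follows. This is only a minimal illustration, not the script actually used: `count_mismatches` and its inputs (two plain lists of ISO date strings, one entry per opinion on the site and per cluster in the db) are hypothetical.

```python
from collections import Counter


def count_mismatches(site_dates, db_dates):
    """Return one report line per date whose counts differ.

    Both arguments are iterables of ISO date strings (hypothetical inputs):
    one entry per opinion listed on the site, one per cluster in the db.
    """
    site_counts = Counter(site_dates)
    db_counts = Counter(db_dates)
    lines = []
    # Newest date first, matching the order of the report above.
    for date in sorted(set(site_counts) | set(db_counts), reverse=True):
        if site_counts[date] != db_counts[date]:
            lines.append(
                f"Error in {date}: {site_counts[date]} in site v "
                f"{db_counts[date]} in db"
            )
    return lines


if __name__ == "__main__":
    site = ["2023-02-03"] * 4 + ["2023-01-26"] * 3
    db = ["2023-02-03"] * 2 + ["2023-01-26"] * 2
    for line in count_mismatches(site, db):
        print(line)
```

A `Counter` returns 0 for missing keys, so dates that exist on only one side are reported as well.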
For 2023-01-26, documents 1 and 2 have the same document / hash
For 2023-02-03, 3 of the opinions have the same hash
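The duplicate hashes account for both of these dates: 3 on the site with 2 sharing a hash leaves 2 unique documents for 2023-01-26, and 4 on the site with 3 sharing a hash leaves 2 unique documents for 2023-02-03, which matches the db counts. A minimal sketch of that check, assuming the hash is a SHA-1 of the raw file bytes (an assumption here, as are the function names and the label-to-bytes input; the real deduplication logic may differ):

```python
import hashlib
from collections import defaultdict


def group_by_hash(documents):
    """Group document labels by the SHA-1 of their content.

    `documents` is a hypothetical mapping of label -> raw bytes.
    Returns only the groups that contain more than one label.
    """
    groups = defaultdict(list)
    for label, content in documents.items():
        groups[hashlib.sha1(content).hexdigest()].append(label)
    return {h: labels for h, labels in groups.items() if len(labels) > 1}


def unique_count(documents):
    """Number of distinct documents once identical content is collapsed."""
    return len({hashlib.sha1(c).hexdigest() for c in documents.values()})


if __name__ == "__main__":
    # Four entries on the site, three of them byte-identical.
    docs = {"a": b"same", "b": b"same", "c": b"same", "d": b"other"}
    print(group_by_hash(docs))
    print(unique_count(docs))
```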
For 2023-04-10 and 2023-06-27, where there are more clusters in the db than on the site, it seems that opinions previously published on the site have since disappeared. For 2023-04-10, there are two clusters in the db: 1, 2. US vs Lara no longer appears on the source on that date, but it is referenced inside a more recent opinion:
> On 10 April 2023, we issued an unpublished opinion where we found that Appellant’s pleas of guilty were not knowing, intelligent acts done with sufficient awareness of the relevant circumstances and likely consequences.
Knowing why the counts don't match, I am closing this issue as completed