Fill `pa`, `pasuperct` and `pacommwct`
Part of #929
To fill the gaps, we will implement a backscraper
pa
Between May 28, 2021 and November 16, 2021 we have 0 documents. The source lists 1261 documents for that time period, some of which are not opinions
pasuperct
Between May 28, 2021 and November 17, 2021 we have 0 documents. We are missing around 1504 "Non Precedential" documents and 164 Precedential opinions (source)
pacommwct
Between May 28, 2021 and August 23, 2021 we have 0 documents. We are missing 49 precedential opinions
I have updated this source to use the API instead of the RSS feed, for both the regular scraper and the backscraper
This source returns OpinionClusters from its API, so it would be an easy candidate to start returning clusters. For example (more in the example files):
{
"Author": null,
"BoardDocketNumber": null,
"Caption": "In the Interest of: N.E.M., Appeal of: N.E.M. - No. 9 EAP 2023",
"CourtDocketNumber": null,
"CourtType": 3,
"DispositionDate": "2024-03-21T00:00:00",
"Keywords": null,
"UserIdentifier": "E.D. Prothonotary",
"UploadDate": "0001-01-01T00:00:00",
"PostedToday": false,
"Postings": [
{
"Id": 88531,
"AuthorId": "Donohue, Christine",
"OpinionId": 80242,
"FileName": "J-41B-2023mo - 105874033259675150.pdf",
"ProcessedDate": "2024-03-21T00:00:00",
"PostingTypeId": "mo",
"PublicationTypeId": null,
"RenderedDate": "2024-03-21T00:00:00",
"SortOrder": 0,
"FileVersion": 1,
"Author": {
"Id": 0,
"AuthorName": "Justice Christine Donohue",
"AuthorCode": "Donohue, Christine",
"Selectable": true,
"SortOrder": 1430
},
"PostType": {
"Id": 0,
"PostingTypeCode": "mo",
"PostingTypeId": "Majority Opinion",
"SortOrder": null
},
"PublicationType": null
},
{
"Id": 88533,
"AuthorId": "Dougherty, Kevin M.",
"OpinionId": 80242,
"FileName": "J-41B-2023co - 105874033259675223.pdf",
"ProcessedDate": "2024-03-21T00:00:00",
"PostingTypeId": "co",
"PublicationTypeId": null,
"RenderedDate": "2024-03-21T00:00:00",
"SortOrder": 0,
"FileVersion": 1,
"Author": {
"Id": 0,
"AuthorName": "Justice Kevin Dougherty",
"AuthorCode": "Dougherty, Kevin M.",
"Selectable": true,
"SortOrder": 1440
},
"PostType": {
"Id": 0,
"PostingTypeCode": "co",
"PostingTypeId": "Concurring Opinion",
"SortOrder": null
},
"PublicationType": null
}
],
"Id": 80242
},
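As a rough illustration of why this response maps naturally onto clusters, here is a minimal sketch that turns one API item into a cluster-shaped dict with one sub-opinion per posting. The function name `to_cluster`, the output keys, and the `POSTING_TYPES` mapping are hypothetical and are not part of this PR or of juriscraper; only the input field names come from the example above.

```python
from datetime import datetime

# Hypothetical mapping from the API's posting type code to a sub-opinion label.
# Only "mo" and "co" appear in the example above; unknown codes pass through.
POSTING_TYPES = {"mo": "majority", "co": "concurrence"}


def to_cluster(item: dict) -> dict:
    """Build a cluster-shaped dict from one API item (illustrative only)."""
    sub_opinions = []
    for posting in item.get("Postings") or []:
        author = posting.get("Author") or {}
        code = posting.get("PostingTypeId")
        sub_opinions.append(
            {
                "file_name": posting["FileName"],
                "author": author.get("AuthorName"),
                "type": POSTING_TYPES.get(code, code),
            }
        )
    return {
        "name": item["Caption"],
        "date_filed": datetime.fromisoformat(item["DispositionDate"]).date(),
        "sub_opinions": sub_opinions,
    }


# Trimmed copy of the example item, keeping only the fields used here.
example = {
    "Caption": "In the Interest of: N.E.M., Appeal of: N.E.M. - No. 9 EAP 2023",
    "DispositionDate": "2024-03-21T00:00:00",
    "Postings": [
        {
            "FileName": "J-41B-2023mo - 105874033259675150.pdf",
            "PostingTypeId": "mo",
            "Author": {"AuthorName": "Justice Christine Donohue"},
        },
        {
            "FileName": "J-41B-2023co - 105874033259675223.pdf",
            "PostingTypeId": "co",
            "Author": {"AuthorName": "Justice Kevin Dougherty"},
        },
    ],
}

cluster = to_cluster(example)
```

Each item already groups its postings under a single caption and disposition date, so the cluster boundary comes from the API itself rather than from any matching logic on our side.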
Commands to fill the gap
docker exec -it cl-django python /opt/courtlistener/manage.py cl_back_scrape_opinions --courts juriscraper.opinions.united_states.state.pa --backscrape-start=05/27/2021 --backscrape-end=11/17/2021
docker exec -it cl-django python /opt/courtlistener/manage.py cl_back_scrape_opinions --courts juriscraper.opinions.united_states.state.pasuperct --backscrape-start=05/27/2021 --backscrape-end=11/18/2021
docker exec -it cl-django python /opt/courtlistener/manage.py cl_back_scrape_opinions --courts juriscraper.opinions.united_states.state.pacommwct --backscrape-start=05/27/2021 --backscrape-end=08/24/2021
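The date ranges in these commands appear to be the gap boundaries listed above widened by one day on each side. The sketch below shows that derivation; the one-day padding is inferred from comparing the dates, and `GAPS` and `backscrape_args` are hypothetical helper names, not part of the codebase.

```python
from datetime import date, timedelta

# Gap boundaries as stated in the sections above.
GAPS = {
    "pa": (date(2021, 5, 28), date(2021, 11, 16)),
    "pasuperct": (date(2021, 5, 28), date(2021, 11, 17)),
    "pacommwct": (date(2021, 5, 28), date(2021, 8, 23)),
}


def backscrape_args(court: str, start: date, end: date, pad_days: int = 1) -> str:
    """Format the command arguments for one court, padding the gap on both sides."""
    pad = timedelta(days=pad_days)
    return (
        f"--courts juriscraper.opinions.united_states.state.{court} "
        f"--backscrape-start={(start - pad):%m/%d/%Y} "
        f"--backscrape-end={(end + pad):%m/%d/%Y}"
    )


for court, (start, end) in GAPS.items():
    print(backscrape_args(court, start, end))
```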