Kevin Ramirez issues

Results: 50 issues of Kevin Ramirez

Update Thompson's Unreported Cases (TN) regex

Add the correct regex so it is identified as a nominative reporter.

Error when opening a csv file named: all.csv

Opening a CSV file named all.csv produces the following error: ![image](https://github.com/user-attachments/assets/a1428e56-b900-4f99-a7a9-8dc32d4efa81)

Add management command to import manually collected court datasets

This new command will allow us to import opinions from manually collected files (originally intended for this issue: https://github.com/freelawproject/courtlistener/issues/1958). With the data already extracted from the nhd PDFs, we can...

Use with_best_text() to retrieve Opinion text in the frontend

- This change implements with_best_text() in opinion_tabs_content.html to optimize opinion text retrieval.
- It also adds an extra value named _original_text_source_ when using with_best_text() because in some cases we need...

Incorrect author_str values in opinions

This problem was found while working on the parent issue. I was able to identify 2660 opinions with incorrect author_str values; in some cases the text is incorrect, in other...

fix(docker compose): Required Dev Credential Handling Update

django-storages no longer falls back to `AWS_ACCESS_KEY_ID` or `AWS_SECRET_ACCESS_KEY` when empty strings are provided in the settings for the dev environment, resulting in this error: `An error occurred...

Add management command to import manually collected court datasets

We need a Django management command that allows us to import court opinions collected manually or semi-automatically (e.g. via local runs of Juriscraper). This will serve as an intermediate solution...

Update code to extract extension and mimetype

- Improve extension and MIME extraction when Magika fails on certain files
- Add magic and other fallbacks to handle tricky formats
- Strip metadata first to avoid detection bias...

PDF misclassified as .ai due to embedded adobe metadata when inferring extension

A valid PDF ([2025_33502.pdf](https://nycourts.gov/reporter/pdfs/2025/2025_33502.pdf)) was misclassified as an Adobe Illustrator (ai) file. The file opens normally and starts with `%PDF-1.6`.

```
head -c 500 /home/quevon24/PycharmProjects/pythonProjects/2025_33502.pdf|xxd
00000000: 2550 4446 2d31 2e36...
```
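The header check described above can be sketched in a few lines of Python. This is only an illustration, not the project's actual detection code: the names `PDF_MAGIC`, `looks_like_pdf`, and `looks_like_pdf_file`, and the 1024-byte search window, are assumptions made for this example. Finding `%PDF-` only confirms a PDF container. It does not rule out an Illustrator file saved with PDF compatibility, so a check like this could at most serve as one signal among several.

```python
# Hypothetical helper names; not taken from the project under discussion.
PDF_MAGIC = b"%PDF-"


def looks_like_pdf(data: bytes, search_window: int = 1024) -> bool:
    """Return True if the PDF header marker appears near the start of the data.

    The marker is searched within a small window instead of only at offset 0,
    because some files carry a few junk bytes before the header.
    """
    return PDF_MAGIC in data[:search_window]


def looks_like_pdf_file(path: str, search_window: int = 1024) -> bool:
    """Read only the first bytes of a file and apply the header check."""
    with open(path, "rb") as fh:
        return looks_like_pdf(fh.read(search_window), search_window)
```

For example, `looks_like_pdf(b"%PDF-1.6\n")` is true, while a PostScript header such as `b"%!PS-Adobe-3.0"` is rejected.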

Text extraction fails for PDFs missing startxref/trailer

Text extraction microservice fails on some PDFs because pdftotext rejects them with errors like:

```
Syntax Error: Couldn't find trailer dictionary
Syntax Error: Couldn't read xref table
```

These PDFs...
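One way to spot such files before extraction is to look for the end-of-file markers in the last bytes of the PDF. The sketch below is an assumption-laden illustration, not the microservice's code: `pdf_tail_markers` and the 2048-byte tail size are hypothetical. A missing `trailer` keyword alone is not necessarily a defect, since PDFs that use cross-reference streams omit it, whereas a missing `startxref` matches the failure in the issue title.

```python
def pdf_tail_markers(data: bytes, tail_size: int = 2048) -> dict:
    """Report which end-of-file markers appear in the last bytes of a PDF.

    Hypothetical helper: it only does substring checks on the tail and does
    not parse or validate the cross-reference table itself.
    """
    tail = data[-tail_size:]
    return {
        "startxref": b"startxref" in tail,
        "trailer": b"trailer" in tail,
        "eof": b"%%EOF" in tail,
    }


# A tiny well-formed tail versus a truncated file with no xref information.
complete = b"%PDF-1.4\nxref\n0 1\ntrailer\n<< /Size 1 >>\nstartxref\n9\n%%EOF\n"
truncated = b"%PDF-1.4\n1 0 obj\n<<>>\nendobj\n"

complete_markers = pdf_tail_markers(complete)
truncated_markers = pdf_tail_markers(truncated)
```

Files where `startxref` is absent could then be routed to a repair step or flagged, instead of being passed directly to pdftotext.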