Rossi
Just realized we don't update `OpinionCluster.citation_count` in the `store_recap_citations` function... should we? This may be another way in which that field is not accurate: https://github.com/freelawproject/courtlistener/issues/5601
This is a script to merge the dockets in this issue, which should be a single docket. To test it, check the commented lines on how to use `clone_from_cl`,...
The title of the issue represents the smallest part. Most repeated docket numbers are not due to spread, but due to - ingestion of misc documents that are part of...
About opinions with no `html_with_citations` field. The bulk are from 2022, and 2024... Maybe from imports? ```sql courtlistener=> select date_part('year', date_created), count(*) from search_opinion where html_with_citations = '' group by...
Most of the files with no `html_with_citations` have `OpinionCluster.source` "U", meaning Harvard, or "G", meaning RECAP. This points to bugs in those ingestion pipelines. @quevon24 @flooie you know the...
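A minimal sketch of the kind of tally behind that observation, assuming the opinions with a blank `html_with_citations` have already been fetched as `(opinion_id, source)` pairs. The sample rows and the `tally_sources` helper are hypothetical; only the "U" (Harvard) and "G" (RECAP) codes come from the comment above.

```python
from collections import Counter

# Source codes named in the comment; any other code is labeled "other".
SOURCE_LABELS = {"U": "Harvard", "G": "RECAP"}


def tally_sources(rows):
    """Count opinions per cluster source, most common first."""
    counts = Counter(source for _opinion_id, source in rows)
    return [
        (src, SOURCE_LABELS.get(src, "other"), n)
        for src, n in counts.most_common()
    ]


# Hypothetical sample rows, not real data.
sample_rows = [(1, "U"), (2, "G"), (3, "U"), (4, "C"), (5, "U")]

for src, label, n in tally_sources(sample_rows):
    print(f"{src} ({label}): {n}")
```

Sorting by count makes it easy to see which ingestion pipeline to inspect first.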
Currently running `find_citations` for opinions with no `html_with_citations`. Right now, on 2023-01-01:

```
nohup ./manage.py find_citations --no-html-with-citations --filed-after 2023-01-01 --verbosity 3 --disconnect-elastic-signals --queue etl_tasks > /tmp/log_find.txt &
```

For speed...
Currently targeting opinions from 2017 to today `nohup ./manage.py find_citations --no-html-with-citations --filed-after 2017-01-01 --verbosity 3 --disconnect-elastic-signals --disable-citation-count-update --opinions-per-task 100 > /tmp/log_find_2017.txt &` ```sql courtlistener=> select count(*) from search_opinion where html_with_citations...
``` nohup ./manage.py find_citations --no-html-with-citations --filed-after 2000-01-01 --verbosity 3 --disconnect-elastic-signals --disable-citation-count-update --opinions-per-task 100 > /tmp/log_find_2000.txt & ``` ```sql courtlistener=> select count(*) from search_opinion where html_with_citations = ''; count --------- 2700095...
About half a million a day. It should be slightly faster: we started slower and missed some hours during the nights. From the first comment with counts / times to the...
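As a rough back-of-the-envelope check, the remaining run time can be estimated from a pending count and the observed daily rate. This sketch uses the 2,700,095 count quoted in an earlier comment and the "about half a million a day" figure as illustrative inputs; the `days_remaining` helper is hypothetical, and the real pending count at the time of this comment may differ.

```python
def days_remaining(pending: int, per_day: int) -> float:
    """Estimate days left at a constant processing rate."""
    if per_day <= 0:
        raise ValueError("per_day must be positive")
    return pending / per_day


pending = 2_700_095  # blank html_with_citations rows, from an earlier count
per_day = 500_000    # approximate observed throughput

print(f"{days_remaining(pending, per_day):.1f} days")  # prints "5.4 days"
```

The estimate assumes a constant rate, so slower starts or idle night hours would push the real figure higher.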
Currently running `nohup ./manage.py find_citations --modified-before 2022-01-01 --verbosity 3 --disconnect-elastic-signals --disable-citation-count-update --opinions-per-task 100 > /tmp/log_find_before_2022.txt &` This is re-processing older opinions to update their `html_with_citations` with the newest...