
duplicate document in solr index

Open hroberts opened this issue 3 years ago • 0 comments

The Solr index appears to contain duplicate documents. The duplicates are filtered out when document lists are returned, but they are still included when results are only counted. For the time being, we are using the hll() json.facet function to estimate the number of unique stories_ids within a result set, but at some point we should figure out how the duplicates are getting into the database even though we are importing everything with overwrite=true.
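For reference, the hll() workaround can be sketched roughly as follows. This is a minimal illustration rather than the project's actual code: the helper names, the `unique_ids` facet label, and the `stories_id` field name are assumptions made for the example. It builds a Solr JSON Request API body that asks for an hll() estimate, then reads both the raw match count and the estimate from the `facets` section of the response.

```python
def build_unique_count_request(query, id_field="stories_id"):
    """Build a Solr JSON Request API body that estimates distinct ids.

    `id_field` is an assumed field name; `unique_ids` is an arbitrary
    label chosen for this sketch.
    """
    return {
        "query": query,
        "limit": 0,  # only counts are needed, so return no documents
        "facet": {"unique_ids": f"hll({id_field})"},
    }


def read_counts(solr_response):
    """Return (raw_count, estimated_unique_count) from a Solr JSON response.

    The raw count includes any duplicate documents; the hll() value is a
    HyperLogLog estimate of distinct ids, so it is approximate.
    """
    facets = solr_response.get("facets", {})
    return facets.get("count", 0), facets.get("unique_ids", 0)
```

A gap between the two numbers returned by `read_counts` for the same query gives a rough measure of how many duplicates that query matches, keeping in mind that the hll() value is itself an estimate.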

hroberts, Jul 27 '20 17:07