
Fix: Improve Solr query stability for languages and subjects

Open saifxyzyz opened this issue 3 weeks ago • 1 comment

Partially Closes #11546

[Fix]

Technical

Changes:

  1. openlibrary/plugins/worksearch/languages.py: Passed _pass_time_allowed=True to the Solr query in get_all_language_counts. When the query runs out of time, Solr now returns the partial results it has collected so far instead of failing.
  2. openlibrary/plugins/worksearch/subjects.py: Changed the publish_year facet limit from -1 (unlimited) to 2000 in SubjectEngine. Solr now returns only the 2000 most common years, which should cover all legitimate historical years while dropping rare dirty values such as typos or invalid years, which would otherwise cause performance issues.
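To illustrate the second change, here is a minimal sketch of how a count-sorted facet limit behaves. It is a pure-Python simulation, not Solr or Open Library code: the helper `top_facet_values` and the sample counts are hypothetical, and it assumes the facet is sorted by count, so the cap keeps the most frequent buckets.

```python
from collections import Counter


def top_facet_values(counts, limit):
    """Simulate a count-sorted facet; a negative limit means 'return every bucket'."""
    # Sort by descending count, breaking ties by value so the order is stable.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked if limit < 0 else ranked[:limit]


# Hypothetical publish_year counts, including two rare dirty values.
publish_years = Counter(
    {"1999": 120, "2005": 95, "1850": 40, "20055": 1, "19x9": 1}
)

unlimited = top_facet_values(publish_years, -1)  # every bucket, dirty ones included
capped = top_facet_values(publish_years, 3)      # only the three most common years
```

Note that a cap like this does not validate years. It only drops low-count buckets, so it removes dirty values to the extent that they are rarer than the legitimate ones.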

Testing

Ran the automated tests (openlibrary/plugins/worksearch/tests/test_worksearch.py). Also created a custom script (openlibrary/plugins/worksearch/tests/test_verification.py) that mocks Solr and verifies that the new parameters are passed correctly:

  1. Timeout handling: Simulated a timeout returning partialResults: True and confirmed that get_all_language_counts returns the available data instead of raising an exception.
  2. Parameter Verification: Confirmed that _pass_time_allowed=True is explicitly passed in the Solr query parameters.
  3. Facet Limit: Confirmed that the publish_year facet request includes limit: 2000.
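The first two checks could be sketched roughly as follows. This is not the contents of test_verification.py: `get_language_counts` is a hypothetical stand-in for get_all_language_counts, and the query string, the other keyword arguments, and the response shape are assumptions made for illustration. Only the `_pass_time_allowed=True` flag and the `partialResults: True` marker come from the description above.

```python
from unittest.mock import Mock


def get_language_counts(solr_select):
    """Hypothetical stand-in for get_all_language_counts."""
    response = solr_select(
        "type:work",              # assumed query, for illustration only
        rows=0,
        facets=["language"],
        _pass_time_allowed=True,  # the flag this PR adds
    )
    # Use whatever facet data came back, even if Solr marked it as partial.
    return dict(response.get("facets", {}).get("language", []))


def check_partial_results_are_used():
    # Simulate a timed-out query: Solr flags the response as partial
    # but still returns the counts collected so far.
    fake_select = Mock(
        return_value={
            "partialResults": True,
            "facets": {"language": [("eng", 10), ("fre", 3)]},
        }
    )
    counts = get_language_counts(fake_select)
    assert counts == {"eng": 10, "fre": 3}
    # Confirm the flag was passed explicitly as a keyword argument.
    assert fake_select.call_args.kwargs["_pass_time_allowed"] is True
    return counts
```

Injecting the Solr call as an argument keeps the sketch free of network access. In the real module, patching the Solr client with `unittest.mock.patch` would serve the same purpose.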

Screenshot

Stakeholders

saifxyzyz, Dec 05 '25 14:12

I've been trying to get the python_tests check to pass, but my test_verification.py file makes it fail.

saifxyzyz, Dec 05 '25 20:12