luceneutil
Nightly tasks should never have more than 5 queries in each category
@jpountz noticed that some of the taxo facets nightly tasks jumped surprisingly when we added the new count(*) tasks.
Digging, I realized that the nightly benchmark's randomness had shifted when we added the new tasks: the benchy shuffles the incoming tasks and then picks the top N per category (N = 5 for the nightly benchy), and some task categories in tasks/wikinightly.tasks have more than 5 unique queries.
This is a sneaky, longstanding bug: it led us to believe there were performance changes when in fact the specific queries being executed had changed, making results incomparable. It has likely affected us a number of times in the past and led us to draw false conclusions.
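To make the failure mode concrete, here is a toy model of "shuffle, then pick the top N per category". It is only a sketch and not the actual luceneutil code: `pick_tasks`, the category names, and the seed are all made up for illustration.

```python
import random

def pick_tasks(tasks, per_category, seed):
    """Toy model: shuffle all (category, query) pairs with a fixed seed,
    then keep the first `per_category` queries seen for each category."""
    shuffled = list(tasks)
    random.Random(seed).shuffle(shuffled)
    picked = {}
    for category, query in shuffled:
        chosen = picked.setdefault(category, [])
        if len(chosen) < per_category:
            chosen.append(query)
    # Sort so that only the *set* of chosen queries matters when comparing.
    return {category: sorted(queries) for category, queries in picked.items()}

# Hypothetical task list: "Big" has 9 unique queries, "Small" has exactly 5.
base = [("Big", f"big{i}") for i in range(9)] + [("Small", f"small{i}") for i in range(5)]
# Adding tasks in an unrelated category changes the shuffle order for everything.
extended = base + [("New", f"new{i}") for i in range(5)]

before = pick_tasks(base, per_category=5, seed=17)
after = pick_tasks(extended, per_category=5, seed=17)

# A category with at most N queries is immune: all of its queries are always picked.
print(before["Small"] == after["Small"])  # True
# A category with more than N queries may silently get a different subset.
print(before["Big"])
print(after["Big"])
```

With at most N unique queries per category, the shuffle can only change execution order, never which queries run, which is why capping each category at 5 makes the nightly results comparable again.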
I plan to:
- Fix nightlyBench.py to check that no more than 5 unique queries are present under each task category
- Make a one-time change to wikinightly.tasks to try to "get back" to the specific tasks we had executed previously, to try to undo the false performance change.
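The first item could look roughly like the sketch below. This is not the check that went into nightlyBench.py: `parse_tasks` and `check_task_counts` are hypothetical names, and the parser assumes each task line has the form `Category: query text # comment`.

```python
from collections import defaultdict

def parse_tasks(lines):
    """Group unique query strings by category.

    Assumed format: 'Category: query text # comment'. Blank lines and
    lines starting with '#' are skipped."""
    by_category = defaultdict(set)
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        category, sep, rest = line.partition(":")
        if not sep:
            continue  # not a task line under the assumed format
        query = rest.split("#", 1)[0].strip()  # drop the trailing comment
        by_category[category.strip()].add(query)
    return by_category

def check_task_counts(lines, max_count):
    """Raise if any category has more than max_count unique queries."""
    for category, queries in parse_tasks(lines).items():
        if len(queries) > max_count:
            listing = "\n  ".join(sorted(queries))
            raise RuntimeError(
                f"category {category} has {len(queries)} unique queries, "
                f"but at most {max_count} are allowed:\n  {listing}"
            )

# Six VectorSearch lines in the assumed format; with max_count=5 this must fail.
sample = [
    "VectorSearch: vector//publisher backstory # freq=194856 freq=148",
    "VectorSearch: vector//many geografia # freq=99550 freq=104",
    "VectorSearch: vector//many foundation # freq=99550 freq=10894",
    "VectorSearch: vector//this school # freq=238551 freq=29912",
    "VectorSearch: vector//interviews # freq=31768",
    "VectorSearch: vector//golf # freq=31760",
]
try:
    check_task_counts(sample, max_count=5)
except RuntimeError as e:
    print(e)
```

Failing hard before the run starts, rather than warning, means an oversized category can never silently change which queries the nightly benchy executes.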
Thank you @jpountz for noticing the original WTF!
Well, it's not only the taxo facets tasks that are subject to random-seed-shift risks:
Traceback (most recent call last):
  File "/l/util.nightly/src/python/nightlyBench.py", line 1857, in <module>
    validate_nightly_task_count(f'{constants.BENCH_BASE_DIR}/tasks/wikinightly.tasks', COUNTS_PER_CAT)
  File "/l/util.nightly/src/python/nightlyBench.py", line 258, in validate_nightly_task_count
    raise RuntimeError(f'nightly tasks file {tasks_file} must have at most {max_count} tasks in each category, but saw {len(tasks)}:\n {tasks_str}')
RuntimeError: nightly tasks file /l/util.nightly/tasks/wikinightly.tasks must have at most 5 tasks in each category, but saw 9:
vector//golf
vector//publisher backstory
vector//many foundation
vector//many geografia
vector//http
vector//interviews
vector//year work
vector//such 2007
vector//this school
Worse, when I peeked in the logs to try to pick which 5 vector searches I should pick/disambiguate to going forward, it's not easy to do so since the KnnFloatVectorQuery's toString is just a vector :) And only its first dimension no less:
TASK: cat=VectorSearch q=KnnFloatVectorQuery:vector[0.02625591,...][100] s=null group=null hits=100 facets=[]
For now I'll just disambiguate to the first 5 vector queries from the existing ones:
VectorSearch: vector//publisher backstory # freq=194856 freq=148
VectorSearch: vector//many geografia # freq=99550 freq=104
VectorSearch: vector//many foundation # freq=99550 freq=10894
VectorSearch: vector//this school # freq=238551 freq=29912
VectorSearch: vector//such 2007 # freq=111526 freq=90200 1.2
VectorSearch: vector//year work # freq=175324 freq=102732 1.7
VectorSearch: vector//interviews # freq=31768
VectorSearch: vector//golf # freq=31760
VectorSearch: vector//http # freq=389790
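Keeping only the first 5 queries of a category, in file order, can be sketched as below. `keep_first_n` is a hypothetical helper, not part of luceneutil, and it again assumes the `Category: query text # comment` line format.

```python
def keep_first_n(lines, n):
    """Keep, per category, only the first n unique queries in file order.

    Comment lines, blank lines, and anything that does not look like a
    task line are passed through unchanged."""
    seen = {}  # category -> unique queries kept so far, in order
    kept = []
    for line in lines:
        stripped = line.strip()
        category, sep, rest = stripped.partition(":")
        if not stripped or stripped.startswith("#") or not sep:
            kept.append(line)
            continue
        query = rest.split("#", 1)[0].strip()
        queries = seen.setdefault(category.strip(), [])
        if query in queries:
            kept.append(line)  # repeat of a query we already kept
        elif len(queries) < n:
            queries.append(query)
            kept.append(line)
        # otherwise: this category is already full, so drop the line
    return kept

tasks = [
    "VectorSearch: vector//publisher backstory # freq=194856 freq=148",
    "VectorSearch: vector//many geografia # freq=99550 freq=104",
    "VectorSearch: vector//many foundation # freq=99550 freq=10894",
    "VectorSearch: vector//this school # freq=238551 freq=29912",
    "VectorSearch: vector//such 2007 # freq=111526 freq=90200 1.2",
    "VectorSearch: vector//year work # freq=175324 freq=102732 1.7",
    "VectorSearch: vector//interviews # freq=31768",
    "VectorSearch: vector//golf # freq=31760",
    "VectorSearch: vector//http # freq=389790",
]
for line in keep_first_n(tasks, 5):
    print(line)
```

Applied to the nine lines above, this keeps the tasks from "publisher backstory" through "such 2007" and drops the remaining four.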
Maybe there is a way to stuff some human readable string into KnnFloatVectorQuery that pops out in its toString() method, to help us humans who otherwise have only vectors to look at? @msokolov?
OK, besides the VectorSearch category, only the taxo facets categories were unstable/non-deterministic:
OrHighMedDayTaxoFacets
AndHighMedDayTaxoFacets
AndHighHighDayTaxoFacets
MedTermDayTaxoFacets
I was able to disambiguate these tasks to their pre-2023/07/28 tasks.
I'll kick off another one-off nightly benchy. Let's see if these taxo facet tasks get back to "normal" ish.
Worse, when I peeked in the logs to try to pick which 5 vector searches I should pick/disambiguate to going forward, it's not easy to do so since the KnnFloatVectorQuery's toString is just a vector :) And only its first dimension no less
I'll open a Lucene issue for this ... we humans still need to be able to read these things :)