Implement rank-score-drop-limit for any ranking phase
Currently, only first-phase supports rank-score-drop-limit. In many contexts, a retrieved document is only discovered to be irrelevant after the second-phase expression, or even after the global phase, once more resources have already been spent on it. We should support drop limits for both the second phase and the global phase. For example, transformer-based cross-encoders produce scores in a fairly interpretable range, which makes a fixed drop limit practical.
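For context, this is roughly what the existing first-phase drop limit looks like in a schema today; the profile name, expression, and limit value are illustrative:

```
rank-profile filter-low-scores inherits default {
    first-phase {
        expression: nativeRank
        rank-score-drop-limit: 0.0
    }
    # This issue asks for the same knob in second-phase and global-phase
}
```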
Hi @jobergum, good to see this enhancement ticket. Currently, I am using binary quantisation (https://pyvespa.readthedocs.io/en/latest/examples/multilingual-multi-vector-reps-with-cohere-cloud.html) for my use case. However, since the cosine similarity is calculated in the second ranking phase and rank-score-drop-limit is NOT available there, a large number of hits is returned from the Vespa layer. Most of these do not meet my required similarity threshold, so I drop them in my application layer. rank-score-drop-limit support in the second phase would help here. Until it is available, is there a way to prune my hits in the ranking layer itself, somehow?
Thanks.
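For reference, the application-layer pruning described above can be sketched as follows. This is a minimal illustration assuming Vespa's default JSON result format (`root.children`); the function name and threshold are hypothetical, not part of any Vespa API:

```python
def prune_hits(response: dict, drop_limit: float) -> dict:
    """Keep only hits whose relevance is strictly above drop_limit.

    Assumes the default Vespa JSON result format:
    {"root": {"children": [{"relevance": ..., ...}, ...]}}
    """
    root = response.get("root", {})
    root["children"] = [
        hit for hit in root.get("children", [])
        # Hits without a relevance score are dropped as well
        if hit.get("relevance", float("-inf")) > drop_limit
    ]
    return response


# Example with a fabricated response dict
resp = {"root": {"children": [
    {"id": "doc-a", "relevance": 0.91},
    {"id": "doc-b", "relevance": 0.17},
]}}
pruned = prune_hits(resp, 0.5)
# Only doc-a survives the 0.5 threshold
```

This runs after the response leaves Vespa, so it saves no ranking or network cost; it only keeps the threshold logic out of the rest of the application.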
Thank you @xansrnitu!
You can prune in a custom searcher implementation. This ticket is about being able to prune in both the second phase and the global phase without having to write a custom searcher. There is no workaround other than a custom searcher implementation.
rank-score-drop-limit for second-phase is available in Vespa 8.354.46.
Documentation: https://docs.vespa.ai/en/reference/schema-reference.html#secondphase-rank-score-drop-limit
System tests: https://github.com/vespa-engine/system-test/pull/4040, https://github.com/vespa-engine/system-test/pull/4051
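Based on the linked reference, usage should look like the following sketch; the profile name, expressions, and limit value here are illustrative, not taken from the documentation:

```
rank-profile prune-second-phase inherits default {
    first-phase {
        expression: nativeRank
    }
    second-phase {
        expression: bm25(title)
        # Hits scoring below this limit after second-phase are dropped
        rank-score-drop-limit: 2.0
    }
}
```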
I have created https://github.com/vespa-engine/vespa/issues/31619 for supporting rank-score-drop-limit in the global phase. Closing this issue.