
Interaction of weighting and phrasal search

Open martindholmes opened this issue 11 months ago • 1 comments

This is not necessarily a bug or an FR, more a prompt for us to discuss and decide on a policy regarding the interaction of weighting and phrasal searching.

If you search for a single word, e.g. "wholesale", without quotation marks, and you get a hit in a weighted context, the hit's score will reflect that weighting (now that the weighting bug has been fixed). However, if you search for the phrase "Wholesale dealers" and get a hit in exactly the same context, the hit's score will not take the context weighting into account. This may perhaps be fine; when you do phrasal searching, you're typically seeking something very specific and the number of hit documents is expected to be very low anyway, so weighting is less important.

On the other hand, if you're searching for a phrase which occurs in many places, you might well want to benefit from weighting to bring hits in significant contexts to the top of the list.

On line 1674 of StaticSearch.js, we appear to assign a standard weight of 2 to any phrasal hit. However, we could take the weight directly from the context at that point. Or we could take the greater of the two, to ensure that phrases still get higher than the default weighting. Thoughts?
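
To make the three options concrete, here is a minimal sketch for discussion. It is not the code in StaticSearch.js: the function name `phrasalHitWeight`, the policy labels, and the assumption that an unweighted context counts as weight 1 are all hypothetical. Only the standard phrasal weight of 2 comes from the description above.

```javascript
// Hypothetical sketch: none of these names come from StaticSearch.js.
// The standard weight that phrasal hits are described as receiving today.
const PHRASE_WEIGHT = 2;

/**
 * Return the weight to apply to a phrasal hit under one of three policies.
 * contextWeight: weight of the context containing the hit (assumed to be 1
 *                when the context is unweighted or the value is missing).
 * policy: 'fixed' (current behaviour), 'context', or 'max'.
 */
function phrasalHitWeight(contextWeight, policy) {
  // Assumption: fall back to 1 for a missing or invalid context weight.
  const ctx = Number.isFinite(contextWeight) && contextWeight > 0 ? contextWeight : 1;
  switch (policy) {
    case 'fixed':
      // Always use the standard phrasal weight, ignoring the context.
      return PHRASE_WEIGHT;
    case 'context':
      // Take the weight directly from the context.
      return ctx;
    case 'max':
      // Take the greater of the two, so a phrase never drops below 2.
      return Math.max(PHRASE_WEIGHT, ctx);
    default:
      throw new Error(`Unknown policy: ${policy}`);
  }
}
```

Under these assumptions, a phrasal hit in a context weighted 5 would get 2 under `'fixed'` and 5 under both `'context'` and `'max'`. The policies differ for unweighted contexts: `'context'` would give 1, which is lower than the current value, while `'max'` would keep 2.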

martindholmes, Mar 08 '24 22:03

Branch weight_context_fix is dealing with this and a related problem with weighting.

martindholmes, Mar 14 '24 16:03

This seems only half-thought-through so far, so I'm pushing it to 2.1.

martindholmes, Oct 07 '24 23:10