elasticsearch-learning-to-rank

Query-time boosting affects model execution but not feature value logging

Open tmanabe opened this issue 3 years ago • 6 comments

Hi. I have a question about this plugin and query-time boosting. It appears that query-time boosting affects model execution but not feature value logging. In my opinion, this discrepancy may lead to inaccurate modeling. Is this the expected behavior?

A test case to reproduce the behavior: https://github.com/o19s/elasticsearch-learning-to-rank/compare/main...tmanabe:different-feature-values

Thanks!

tmanabe, May 10 '21 08:05

This is true: boosting only affects the output score of the model. I think this is what is expected when you boost a query: the query score is multiplied by the boost value.

If the query boost were to affect the feature values, I'm not sure how this would work:

  • depending on the model, the output would no longer be what most users expect, i.e. output_score != model_score * boost
  • the boost becomes inherent to model training: running the model with a different boost than the one used to extract feature values might lead to odd behavior, especially for decision trees.

I think it's less error-prone to keep features as independent as possible from their context (here, the parent query boost) to reduce possible discrepancies between training and runtime.
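
To make the two bullet points concrete, here is a toy sketch. The models, weights, and feature values are hypothetical and are not taken from the plugin. It compares applying a boost to the feature values with applying it to the model output, for a linear model and for a one-split decision tree.

```python
# Toy illustration only: the models and numbers below are hypothetical
# and do not come from the plugin.

def linear_model(features):
    # Weighted sum: scaling every feature by b scales the output by b.
    weights = [0.5, 2.0]
    return sum(w * f for w, f in zip(weights, features))


def tree_model(features):
    # One-split decision tree (a stump) thresholding the first feature.
    return 1.0 if features[0] > 3.0 else 0.2


def score_with_boosted_features(model, features, boost):
    # The boost leaks into the feature values before the model sees them.
    return model([f * boost for f in features])


def score_with_boosted_output(model, features, boost):
    # The boost is applied only to the model's final score.
    return model(features) * boost


features = [2.0, 1.0]
boost = 2.0

for model in (linear_model, tree_model):
    print(
        model.__name__,
        score_with_boosted_features(model, features, boost),
        score_with_boosted_output(model, features, boost),
    )
```

For the linear model the two strategies agree, but for the stump the boosted feature crosses the split threshold, so the tree takes a different branch and the result is no longer the model score times the boost.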

nomoa, May 10 '21 12:05

Thanks for your quick reply!

With a debugger, I compared the feature values used for model execution with the ones used for feature value logging. My understanding is that query-time boosting affects the feature values themselves (not only the model output), and that it does so at model execution time only, not at logging time.

[Two debugger screenshots, 2021-05-12 18:04]

So these two are also my concern:

  • Depending on the model, output_score != model_score * boost
  • Models can be trained with original feature values then executed with boosted feature values

tmanabe, May 12 '21 09:05

@tmanabe thanks for digging into this! You're absolutely correct: I wrongly assumed that the RankerQuery did not propagate the boost, but it does, as shown in your debugging session. The culprit seems to be RankerQuery#createWeight at https://github.com/o19s/elasticsearch-learning-to-rank/blob/main/src/main/java/com/o19s/es/ltr/query/RankerQuery.java#L201 where the top-level boost is passed to the feature queries.

I'd be in favor of forcing the boost to 1 for feature queries and applying the boost a posteriori from the scorer (https://github.com/o19s/elasticsearch-learning-to-rank/blob/main/src/main/java/com/o19s/es/ltr/query/RankerQuery.java#L312).
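
A minimal sketch of that idea, written as a toy Python model rather than against the plugin's Java code. The names `RankerSketch`, `title_match`, `body_match`, and `stump` are hypothetical stand-ins and not part of the plugin. Feature functions are always evaluated with a boost of 1.0, and the top-level boost multiplies only the final model score.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical stand-in for a feature query: it takes a document and the
# boost it was created with, and returns a feature value.
FeatureFn = Callable[[Dict[str, float], float], float]


def title_match(doc: Dict[str, float], boost: float) -> float:
    return boost * doc["title_score"]


def body_match(doc: Dict[str, float], boost: float) -> float:
    return boost * doc["body_score"]


def stump(features: List[float]) -> float:
    # Hypothetical one-split tree on the first feature.
    return 1.0 if features[0] > 3.0 else 0.25


@dataclass
class RankerSketch:
    feature_fns: List[FeatureFn]
    model: Callable[[List[float]], float]

    def feature_vector(self, doc: Dict[str, float]) -> List[float]:
        # Force the boost to 1.0 so feature values never depend on the
        # top-level query boost.
        return [fn(doc, 1.0) for fn in self.feature_fns]

    def log_features(self, doc: Dict[str, float]) -> List[float]:
        # Logging sees exactly the values the model is executed on.
        return self.feature_vector(doc)

    def score(self, doc: Dict[str, float], boost: float) -> float:
        # Apply the boost a posteriori, to the model output only.
        return self.model(self.feature_vector(doc)) * boost


ranker = RankerSketch([title_match, body_match], stump)
doc = {"title_score": 2.0, "body_score": 0.5}
print(ranker.log_features(doc), ranker.score(doc, 1.0), ranker.score(doc, 2.0))
```

With this arrangement the logged feature values and the values fed to the model are the same by construction, and the score scales linearly with the boost whatever the model type.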

@worleydl do you have any thoughts on this?

nomoa, May 14 '21 06:05

For context, I think the behavior changed with https://github.com/o19s/elasticsearch-learning-to-rank/commit/b907213a3baba02add3ae89eb3ebcbea881289de#diff-07788001c91b0b5c03be973de2a368900204bab6c6fc6d3255ec34bcf6184c09L239 where we normalized explicitly with a boost set to 1.0F (Elasticsearch 6.1.0 upgrade).

nomoa, May 14 '21 11:05

Thanks for the additional info, David. This definitely seems like a regression. Maybe we can add some additional test cases around it and get the boost set up as you describe?

worleydl, May 14 '21 13:05

@worleydl coming back to this issue, should we lay out the unit test requirements and tag this as help wanted (or to be developed)?

nathancday, Sep 07 '21 19:09