api icon indicating copy to clipboard operation
api copied to clipboard

Problems with pelias/api commit "feat(confidence): Add support for new query names"?

Open WaughWaugh opened this issue 5 years ago • 3 comments

Hi @orangejulius ,

I believe that there is a problem with your commit for "feat(confidence): Add support for new query names". In middleware/confidenceScoreFallback.js: line 28, I see 2 potential problems:-

  • Shouldn't res.meta.queryType be res.meta.query_type?
  • Shouldn't 'search_original' be present in the include list along with ''search_fallback', 'address_search_with_ids' and 'structured'?

I would be happy to get your feedback regarding this.

WaughWaugh avatar Apr 17 '19 13:04 WaughWaugh

Hi @WaughWaugh,

I believe everything is correct here. Yes, it's confusing that some places in the code use queryType, and others use query_type, but we have enough unit tests. And as https://github.com/pelias/api/pull/1224 taught us, the code breaks if the query type isn't passed along properly.

It's also confusing that search_original is not a query type in the list of queries, but as that query type has its own confidence score calculation, that is definitely correct.

What behavior are you observing that leads you to believe there's an issue?

orangejulius avatar Apr 23 '19 16:04 orangejulius

Thanks for getting back to me, @orangejulius.

I'm facing an issue with this address:

1328 COUNTRY RIDGE, INDIANAPOLIS, IN, 46234

Now, previously I was using pelias/api version 3.35.0 and was getting a match with a confidence level of 0.919. But after I upgraded to pelias/api version 3.60.0, I saw the confidence level drop to 0.8.

After debugging, I think the issue lies in this piece of code from middleware/confidenceScoreFallback.js:line 28,

if (['search_fallback', 'address_search_with_ids', 'structured'].includes(res.meta.queryType)) {
    return next();
  }

Here are my findings:-

  • This search query runs through your processing pipeline as defined by your router and gets processed by your search_original query, which returns the renderedQuery.type as 'search_original'.
  • In your controller/search.js, you are using the renderedQuery.type property in 2 places,
    • line 62, you are setting the queryType of the message object with this above type, but message is used only for logging purposes.
    • line 100, you are setting res.meta.query_type with this above type.
  • Now, in your middleware/confidenceScore.js::computeScores(), you are checking against this res.meta.query_type, which is correctly set by the code mentioned above.
  • But in middleware/confidenceScoreFallback.js::computeScore(), you are checking against res.meta.queryType - which is always set to undefined, as this is not set anywhere.

This is Issue 1. For unit tests, if you added a mock for computeConfidenceScore function, you can see that it will be called for search_fallback'or structured query types, where it is clear from your code that the intention is to skip this confidence score calculation for those query type.

Next, in the code if (['search_fallback', 'address_search_with_ids', 'structured'].includes(res.meta.queryType)), one of the type is address_search_with_ids whereas the query_type set in query/address_search_using_ids.js, line 196 is address_search_using_ids. This is Issue 2 - a typo.

Lastly, even if we omit these errors, for query type search_original, both of these confidence score calculators are getting executed. First, middleware/confidenceScore,js::computeScore() is getting executed which is setting a confidence score of 0.919 for the above address, but after that middleware/confidenceScoreFallback.js::computeScore() is getting called, which is resetting the score back to 0.8.

This is what is causing the difference in score from the earlier version of pelias/api.

In the earlier version (3.35.0), the logic was to execute confidenceScore.js:computeScore() if query_type was original, and confidenceScoreFallback.js:computeScore() would execute if query_type was fallback. Hence, for original query type, only confidenceScore.js:computeScore() would be executed only, setting the confidence score to 0.919.

This is Issue 3.

Please tell me what you think about these observations. Are these actual issues or am I missing something here?

WaughWaugh avatar Apr 24 '19 10:04 WaughWaugh

Hey @WaughWaugh, I think you're right, thanks for taking the time to dive into this. Since you seem to have a good grasp of the problem, can you submit a pull request to fix it? Thanks!

orangejulius avatar Apr 29 '19 16:04 orangejulius