Suboptimal relevance calculation

Open karussell opened this issue 10 years ago • 5 comments

While testing the new 1.2 release, I noticed the following cases where the old version was also not optimal:

karussell avatar Jan 16 '15 16:01 karussell

potsdam bahnhof: both docs have importance 0.275 and contain all keywords, see 1 2

bayreuth sportzentrum: same here

stresemannstr: city assignments are wrong for some German cities (kreisfreie Städte) in Nominatim. Combined with bug #136, this led to an empty result

christophlingg avatar Jan 19 '15 16:01 christophlingg

When importance is the same, can you reorder the result by the number of words from the query actually showing up in name and city (and possibly state)?
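A minimal sketch of that tie-break, assuming each result is a plain dict with `importance`, `name`, `city` and `state` keys. The function names and the sample data are hypothetical and not taken from the photon code base:

```python
def matched_word_count(query, doc, fields=("name", "city", "state")):
    """Count the query words that appear as whole words in the given fields."""
    query_words = set(query.lower().split())
    doc_words = set()
    for field in fields:
        doc_words.update(str(doc.get(field, "")).lower().split())
    return len(query_words & doc_words)


def rerank(query, docs):
    """Sort by importance, then by number of matched query words (both descending)."""
    return sorted(
        docs,
        key=lambda d: (d["importance"], matched_word_count(query, d)),
        reverse=True,
    )


# Hypothetical results with identical importance.
station_elsewhere = {"name": "Bahnhof", "city": "Werder", "state": "Brandenburg", "importance": 0.275}
station_potsdam = {"name": "Bahnhof", "city": "Potsdam", "state": "Brandenburg", "importance": 0.275}

ranked = rerank("potsdam bahnhof", [station_elsewhere, station_potsdam])
print([d["city"] for d in ranked])
```

Because importance stays the primary sort key, the word count only decides the order among results whose importance is equal.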

lonvia avatar Jan 21 '15 19:01 lonvia

I'm not sure if the following idea helps in all cases: e.g. if you look at 'potsdam bahnhof', the second match is better because it matches in more fields (name, state/boundary and city) than the first match, where only name and state match. Also, as @lonvia pointed out, it could help to differentiate between exact and partial matches in the relevance calculation.

Also, personally I find a match in city more important than one in state/boundary, but there might be cases where this is wrong, not sure. E.g. if you search for 'brandenburg bahnhof' it is very unlikely that you mean the state, maybe only if you had searched for the plural 'brandenburg bahnhöfe' ;)

Update: this example was bad, as the only city is 'brandenburg an der havel'. Well, in general I would just prefer a match in city over a match in state/boundary
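To make the idea concrete, here is a rough sketch of field-weighted scoring that prefers city over state and separates exact from partial matches. The weights, the partial-match factor and all names are assumptions chosen for illustration, not values from the existing query:

```python
# Assumed weights: name counts most, city more than state/boundary.
FIELD_WEIGHTS = {"name": 3.0, "city": 2.0, "state": 1.0}
PARTIAL_FACTOR = 0.5  # assumed discount for a substring-only match


def field_score(query_words, value):
    """Score one field: 1.0 per exact word match, PARTIAL_FACTOR per partial match."""
    words = value.lower().split()
    score = 0.0
    for q in query_words:
        if q in words:
            score += 1.0
        elif any(q in w for w in words):
            score += PARTIAL_FACTOR
    return score


def relevance(query, doc):
    """Sum the weighted per-field scores for a document."""
    query_words = query.lower().split()
    return sum(
        weight * field_score(query_words, str(doc.get(field, "")))
        for field, weight in FIELD_WEIGHTS.items()
    )


# Hypothetical documents: one matches the query word in city, one in state.
in_city = {"name": "Bahnhof", "city": "Brandenburg an der Havel", "state": ""}
in_state = {"name": "Bahnhof", "city": "", "state": "Brandenburg"}

print(relevance("brandenburg bahnhof", in_city) > relevance("brandenburg bahnhof", in_state))
```

With these assumed weights a document matching in more fields also scores higher, since every matching field adds to the sum.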

karussell avatar Jan 21 '15 20:01 karussell

stresemann straße 36 can be found now.

the other cases require a thorough rework of the existing Elasticsearch query and structure.

christophlingg avatar Feb 10 '15 13:02 christophlingg

Thanks!

karussell avatar Feb 10 '15 13:02 karussell