photon
Suboptimal relevance calculation
While doing some tests for the new 1.2 I noticed the following cases where the old version was also not optimal:
- potsdam bahnhof -> the best match is second
- bayreuth sportzentrum -> the best match is second
- ~~Stresemannstr. 36, 10963 Berlin -> no match? Should result in something as Stresemannstraße exists in Berlin (even twice ;))~~
potsdam bahnhof: both docs have importance 0.275 and contain all keywords, see 1 2
bayreuth sportzentrum: same here
stresemannstr: city assignments are wrong for some German cities (kreisfreie Städte) in Nominatim. Combined with bug #136, this led to an empty result.
When importance is the same, can you reorder the result by the number of words from the query actually showing up in name and city (and possibly state)?
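To make the tie-breaking idea concrete, here is a minimal sketch in Python. It is not photon's implementation (photon scores inside an Elasticsearch query); the function names `matched_terms` and `rerank` and the sample documents are made up for illustration, and only the importance value 0.275 comes from this thread.

```python
def matched_terms(query, doc, fields=("name", "city", "state")):
    """Count query words that occur as whole words in any of the given fields."""
    words = set()
    for field in fields:
        words.update(doc.get(field, "").lower().split())
    return sum(1 for term in query.lower().split() if term in words)


def rerank(query, docs):
    """Sort by importance first; break ties by the number of matched query words."""
    return sorted(
        docs,
        key=lambda d: (d["importance"], matched_terms(query, d)),
        reverse=True,
    )


# Hypothetical documents with equal importance, not real photon output.
docs = [
    {"id": "a", "name": "Bahnhof", "city": "Example Town",
     "state": "Brandenburg", "importance": 0.275},
    {"id": "b", "name": "Bahnhof", "city": "Potsdam",
     "state": "Brandenburg", "importance": 0.275},
]

ranked_ids = [d["id"] for d in rerank("potsdam bahnhof", docs)]
```

With these sample documents, "b" matches both query words while "a" matches only one, so "b" moves to the top even though the importance values are identical.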
I'm not sure if the following idea helps in all cases: e.g. for 'potsdam bahnhof', the second match is better because it matches in more fields (name, state/boundary and city) than the first match, where only name and state match. Also, as @lonvia pointed out, it could help if the relevance calculation differentiated between exact and partial matches.
Also, personally I find a match in city more important than one in state/boundary, but there might be cases where this is wrong, not sure. E.g. if you search for 'brandenburg bahnhof' it is very unlikely that you mean the state, maybe only if you had searched for the plural 'brandenburg bahnhöfe' ;)
Update: this example was bad as there is only 'brandenburg an der havel' as a city. Well, in general I would just prefer a match in city over one in state/boundary.
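A rough sketch of how field preference and exact vs. partial matching could be combined, written in Python for illustration only. The weights, the 0.5 partial-match factor, the function names and the sample documents are all assumptions, not values taken from photon or Elasticsearch.

```python
# Assumed weights: a match in city counts more than one in state/boundary.
FIELD_WEIGHTS = {"name": 3.0, "city": 2.0, "state": 1.0}
EXACT_FACTOR = 1.0    # query word equals a whole word in the field
PARTIAL_FACTOR = 0.5  # query word is only a substring of a word in the field


def term_score(term, text):
    """Return the match factor of one query word against one field value."""
    tokens = text.lower().split()
    if term in tokens:
        return EXACT_FACTOR
    if any(term in token for token in tokens):
        return PARTIAL_FACTOR
    return 0.0


def relevance(query, doc):
    """Sum weighted match factors over all query words and all fields."""
    score = 0.0
    for term in query.lower().split():
        for field, weight in FIELD_WEIGHTS.items():
            score += weight * term_score(term, doc.get(field, ""))
    return score


# Hypothetical documents, not real photon output.
city_match = {"name": "Bahnhof", "city": "Potsdam", "state": ""}
state_match = {"name": "Bahnhof", "city": "", "state": "Potsdam"}
partial_match = {"name": "Hauptbahnhof", "city": "Potsdam", "state": ""}

query = "potsdam bahnhof"
```

Here `relevance(query, city_match)` is higher than `relevance(query, state_match)` because of the field weights, and higher than `relevance(query, partial_match)` because 'bahnhof' only partially matches 'Hauptbahnhof'. In a real setup the same effect would have to be expressed in the search query itself rather than in post-processing.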
stresemann straße 36 can be found now.
the other cases require a thorough rework of the existing Elasticsearch query and structure.
Thanks!