photon
(WIP) Add custom hyphenation plugin
This PR provides a fix for #42.
@karussell could you give it a try, please? I have not run geocoder-tests yet, but I would like to discuss some TODOs:
- photon-es is currently just a subdirectory, not a Maven module. To proceed, I'd move the current src/es to a new core submodule, WDYT?
- Should the decompounding apply only to the .de. analyzers? Applying the decompounder to all languages probably does no harm, but that should be checked against geocoder-tests. Applying it only to the .de. analyzers would not currently work, IMHO, because Photon copies only name:de to the language-specific collector.de and name.de fields, and many German streets in OSM have just a default name tag without an additional name:de. Deducing the language of default names from the address's country might work(?)
- Install photon-es.jar from the Maven build instead of committing the binary.
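
To make the second TODO more concrete, here is a rough sketch of the "deduce the language from the country" idea. It is not part of this PR and not Photon's actual import code. The class name, the method name and the tiny country-to-language table are all hypothetical, and multilingual countries are deliberately left out.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper, NOT Photon code: picks the name to index for a language.
final class DefaultNameLanguageSketch {

    // Assumption: a hand-made table of countries with one dominant language.
    // Multilingual countries would need a different approach.
    private static final Map<String, String> LANGUAGE_BY_COUNTRY =
            Map.of("de", "de", "at", "de", "fr", "fr");

    /**
     * Returns the name to use for the given language:
     * an explicit name:<language> tag wins; otherwise the default name tag is
     * used if the address's country maps to that language.
     */
    static Optional<String> nameFor(Map<String, String> tags, String countryCode, String language) {
        String explicit = tags.get("name:" + language);
        if (explicit != null) {
            return Optional.of(explicit);
        }
        String countryLanguage = countryCode == null
                ? null
                : LANGUAGE_BY_COUNTRY.get(countryCode.toLowerCase(Locale.ROOT));
        if (language.equals(countryLanguage)) {
            return Optional.ofNullable(tags.get("name"));
        }
        return Optional.empty();
    }
}
```

With this, a street in Germany tagged only with name would still end up in the German fields, while a street in France tagged only with name would not.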
Thanks, will try!
@hbruch and @karussell did you follow up on this any further?
In German there are a lot of street names like "Kölner Straße" which users spell as "Kölnerstraße" and thus do not get any results with photon. I tested the change locally, it seems to fix the issue and I did not notice any regressions.
Are there any plans to merge this in the future?
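
For anyone reading along, the spelling mismatch described above can be illustrated with a small standalone sketch. It is not the plugin from this PR and does not use Elasticsearch. It is a naive suffix-based splitter with a hypothetical suffix list, meant only to show why decompounding makes both spellings produce the same tokens.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Illustration only, NOT the PR's plugin: a naive suffix-based decompounder.
final class DecompoundSketch {

    // Assumption: a tiny, hand-picked list of common German street suffixes.
    private static final List<String> SUFFIXES =
            List.of("straße", "strasse", "platz", "allee", "gasse", "weg");

    /** Lowercases the query, splits on whitespace, then splits off a known suffix. */
    static List<String> tokens(String query) {
        List<String> out = new ArrayList<>();
        for (String word : query.toLowerCase(Locale.GERMAN).trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue;
            }
            String suffix = matchingSuffix(word);
            if (suffix != null) {
                // "kölnerstraße" -> "kölner" + "straße"
                out.add(word.substring(0, word.length() - suffix.length()));
                out.add(suffix);
            } else {
                out.add(word);
            }
        }
        return out;
    }

    // Returns the first suffix that ends the word, if the word is longer than the suffix.
    private static String matchingSuffix(String word) {
        for (String suffix : SUFFIXES) {
            if (word.length() > suffix.length() && word.endsWith(suffix)) {
                return suffix;
            }
        }
        return null;
    }
}
```

Both `DecompoundSketch.tokens("Kölner Straße")` and `DecompoundSketch.tokens("Kölnerstraße")` yield the tokens `kölner` and `straße`, so a query in either spelling can match a document indexed in the other. A real decompounder needs a dictionary or hyphenation patterns, since a plain suffix check like this one will also split words that merely happen to end in a suffix.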
Hey, I know this PR is pretty old at this point. However, I have the same issue with German street names and just tested its changes locally. Aside from a small merge conflict and some outdated Elasticsearch version references, this PR still seems to solve the issue.
Is there any chance that these changes will end up in Photon at some point? @hbruch @karussell @lonvia