opensearchserver
Automatic Language detection
Hi,
I entered the URL http://www.sicris.si/public/jqm/memo.aspx?lang=slv&opdescr=faq&source=evaluation.inc&opt=3&subopt=7 into Manual crawl. Automatic language detection reported: Lang: cs. It should be sl (Slovenian), not Czech.
I found an article in your FAQ: How the lang attribute of webpages gets detected
So the fallback to content-based detection is not working properly. We'll try to solve this with language parameters on our test site and see how it works out.
One option is to put a language parameter in the HTML documents. Then it detects SL, but the search results return English as the first result.
I think I could solve this with the language parameter in the query, but it does not offer Slovenian.
Is it possible to add it?
Let me turn my question around: would you add the Slovene language to your "ngram detection"?
In issue #1822 you once gave me instructions on how to add a Slovene lemmatizer, which I did. But how can I use it? And, if I'm right, it is not connected with the "ngram detection"?