elasticsearch-plugin-bundle
_langdetect endpoint missing in ES 2.2?
With ES 1.7 we used the _langdetect endpoint to verify the language of a document prior to indexing it, following the examples from https://github.com/jprante/elasticsearch-langdetect.
Trying the same with ES 2.2 and bundle 2.2.0.1, the example query now returns an error:
curl -XPOST 'localhost:9200/_langdetect?pretty' -d 'Das ist ein Test'
{
  "error" : {
    "root_cause" : [ {
      "type" : "invalid_index_name_exception",
      "reason" : "Invalid index name [_langdetect], must not start with '_'",
      "index" : "_langdetect"
    } ],
    "type" : "invalid_index_name_exception",
    "reason" : "Invalid index name [_langdetect], must not start with '_'",
    "index" : "_langdetect"
  },
  "status" : 400
}
Is the endpoint still available somewhere?
No comments? I hope my question wasn't too birdbrained... ;) If it was, I would still appreciate a hint on what I am missing.
Sorry, I overlooked the issue.
I released 2.2.0.2 with a fix.
The download link for the plugin zip file is
https://github.com/jprante/elasticsearch-plugin-bundle/releases/download/2.2.0.2/elasticsearch-plugin-bundle-2.2.0.2-plugin.zip
Thanks a lot for your response! I installed it right away. Unfortunately, I get an error no matter whether I execute the request from Sense or from the command line:
curl -XPOST 'localhost:9200/_langdetect?pretty' -d 'Das ist ein Test'
{
  "error" : {
    "root_cause" : [ {
      "type" : "illegal_state_exception",
      "reason" : "failed to find action [org.xbib.elasticsearch.action.langdetect.LangdetectAction@d8b70e11] to execute"
    } ],
    "type" : "illegal_state_exception",
    "reason" : "failed to find action [org.xbib.elasticsearch.action.langdetect.LangdetectAction@d8b70e11] to execute"
  },
  "status" : 500
}
OK, that was the reason why I removed the REST action. I have to investigate how to solve this class loader issue.
Thanks in advance! IMHO the _langdetect REST endpoint is quite an important feature, since it makes it possible to check the language prior to indexing. Each document can then be sent to the right index, which has the appropriate analyzers for that language.
Thx for posting the update!! Just found a typo in the install link:
./bin/plugin install 'http://search.maven.org/remotecontent?filepath=org/xbib/elasticsearch/plugin/elasticsearch-plugin-bundle/2.2.0.3/elasticsearch-plugin-bundle-2.2.0.3-plugin.zip'
Attaching the right analyzer is not what the REST endpoint is for.
In ES 1.x this was possible by assigning an analyzer path. In ES 2.x this was removed. I will implement a multi-field name extension that automatically sets language analyzers, see https://www.elastic.co/guide/en/elasticsearch/guide/current/mixed-lang-fields.html#_analyze_multiple_times
Thanks for finding the typo.
This is probably not the right place to discuss "best practices" (which I would be interested in), but following some recommendations found around the internet, we decided to go for separate indices for each language, for example "myindex_de" and "myindex_en". Therefore we have to detect the language prior to indexing. This way we can run searches on "myindex_*" to get results in multiple languages, and we avoid all the trouble with mixed languages.
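To make the workflow concrete, here is a rough Python sketch of how we route documents, not a definitive implementation. It assumes the endpoint answers with a JSON body containing a "languages" list of {"language", "probability"} entries, as in the elasticsearch-langdetect examples. The helper names, the "myindex" prefix, the "en" fallback and the 0.5 threshold are all made up for illustration.

```python
import json
import urllib.request


def detect_language(text, base_url="http://localhost:9200"):
    """POST raw text to the _langdetect endpoint and return the parsed JSON.

    Assumes a node with the plugin installed is reachable at base_url.
    """
    request = urllib.request.Request(
        base_url + "/_langdetect",
        data=text.encode("utf-8"),
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def choose_index(detect_response, prefix="myindex", fallback="en",
                 min_probability=0.5):
    """Pick a per-language index name such as "myindex_de".

    Assumed response shape (not verified against every plugin version):
    {"languages": [{"language": "de", "probability": 0.99}, ...]}
    Falls back to the fallback language when nothing was detected or the
    best candidate is below min_probability.
    """
    languages = detect_response.get("languages") or []
    if not languages:
        return f"{prefix}_{fallback}"
    best = max(languages, key=lambda entry: entry.get("probability", 0.0))
    if best.get("probability", 0.0) < min_probability:
        return f"{prefix}_{fallback}"
    return f"{prefix}_{best['language']}"
```

In our case the result of detect_language('Das ist ein Test') would be passed to choose_index(), and the document is then indexed into the returned index. Searches still go against "myindex_*".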