elasticsearch-langdetect
Can't aggregate?
Hello,
I've tried creating an aggregation using all of the examples in the README without any luck yet.
If I try to aggregate on the stored langdetect field, ES tells me the field needs fielddata: true. However, it does not let me enable fielddata on the langdetect field, since it is not a text field.
I have also tried the lang subfield mentioned in the README, but the aggregation on it returns no buckets.
Example:
PUT /test
{
  "mappings": {
    "docs": {
      "properties": {
        "text": {
          "type": "langdetect",
          "languages": [ "en", "de", "fr" ],
          "store": true
        }
      }
    }
  }
}
PUT /test/docs/1
{
  "text": "Oh, say can you see by the dawn`s early light, What so proudly we hailed at the twilight`s last gleaming?"
}
GET /test/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "query_string": {
            "query": "text:*"
          }
        }
      ]
    }
  },
  "aggregations": {
    "language": {
      "terms": { "field": "text.lang" }
    }
  }
}
Response:
{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 1,
    "hits": [
      {
        "_index": "test",
        "_type": "docs",
        "_id": "1",
        "_score": 1,
        "_source": {
          "text": "Oh, say can you see by the dawn`s early light, What so proudly we hailed at the twilight`s last gleaming?"
        }
      }
    ]
  },
  "aggregations": {
    "language": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": []
    }
  }
}
Am I missing something, or doing this wrong?
Thanks!
My langdetect plugin creates a new field type that is not a string/text/keyword type, which may be why the aggregation does not work.
I will look into this issue. Maybe the idea of a new field type was over-engineered, and I can find a simple way to map it to a straightforward string/text/keyword field type.
Maybe add the possibility to set fielddata? Or the ability to add a subfield to the language field, so that the language is also stored as a keyword that aggregations can run on?
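To make the keyword idea concrete, here is a minimal sketch in Python of what the mapping and the aggregation request could look like. It assumes the language code is computed before indexing and written to a plain keyword field; the field name `lang` and the helper `terms_aggregation` are hypothetical and are not part of the plugin. Keyword fields have doc values enabled by default, so a terms aggregation on them does not need fielddata.

```python
import json

# Assumption: "lang" is a hypothetical keyword field that holds a language
# code computed before indexing. It is NOT populated by the langdetect plugin.
mapping = {
    "mappings": {
        "docs": {
            "properties": {
                "text": {"type": "text"},
                "lang": {"type": "keyword"},
            }
        }
    }
}


def terms_aggregation(field, name="language"):
    """Build a search body that counts documents per distinct value of `field`."""
    return {
        "size": 0,  # return only the aggregation buckets, no hits
        "aggregations": {name: {"terms": {"field": field}}},
    }


search_body = terms_aggregation("lang")

# Body for PUT /test
print(json.dumps(mapping, indent=2))
# Body for GET /test/_search
print(json.dumps(search_body, indent=2))
```

The printed bodies can be pasted into the same PUT /test and GET /test/_search requests as in the example above. Whether the plugin could fill such a field on its own is exactly the open question here.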