Flowpack.ElasticSearch.ContentRepositoryAdaptor
Use flattened field type instead of object for neos_fulltext_parts
The field neos_fulltext_parts stores the fulltext path for every aggregateRoot document, with the child node identifiers as keys. This can easily lead to a mapping explosion. Starting with Elasticsearch 7.3, you can configure:
properties:
  'neos_fulltext_parts':
    search:
      elasticSearchMapping:
        type: flattened
      indexing: ''
to avoid this.
This should be the default when 7.x is the minimal supported version.
Note: as per https://www.elastic.co/guide/en/elasticsearch/reference/7.5/release-highlights-7.3.0.html#_new_flattened_field_type this type is only available "with the default distribution of Elasticsearch." – whatever that means.
The bad thing is, we can't simply override this in userland code… The default is
'neos_fulltext_parts':
  search:
    elasticSearchMapping:
      type: object
      enabled: false
    indexing: ''
and when changing the type to flattened, you get this: Mapping definition for [neos_fulltext_parts] has unsupported parameters: [enabled : false]
I changed this directly; configuration:show now gives me:
neos_fulltext_parts:
  search:
    elasticSearchMapping:
      type: flattened
    indexing: ''
like above. I still get this:
{
  "update": {
    "_index": "neos-fb11fdde869d0a8fcfe00a2fd35c031d",
    "_type": "_doc",
    "_id": "0c079d7c6c2a427f76eaf55340dd28d79ce3ae47",
    "status": 400,
    "error": {
      "type": "illegal_argument_exception",
      "reason": "Limit of total fields [1000] has been exceeded"
    }
  }
}
Debugging shows me: during indexing, an index named like neos-e781f29c8dd927c09735547a848e3459-1612469847 is created, which shows this when fetching the mapping:
"neos_fulltext_parts": {
  "type": "flattened"
}
As soon as documents are actually added, this happens to an index named like neos-e781f29c8dd927c09735547a848e3459 (note the missing suffix), in which the mapping is like this:
"neos_fulltext_parts": {
  "properties": {
    "04b90b2e-fc62-4c31-aa7a-12cbb2c8dc94": {
      "properties": { … }
    },
    …
  }
}
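To make the difference between the two mappings concrete, here is a minimal Python sketch that counts the fields in a mapping fragment. It relies on the documented behaviour that both object fields and leaf fields count towards index.mapping.total_fields.limit; it ignores multi-fields and field aliases for brevity. The child node identifiers and sub-field names below are made up for illustration and are not taken from a real index.

```python
def count_mapped_fields(mapping: dict) -> int:
    """Count every field under "properties", including object fields themselves."""
    total = 0
    for field in mapping.get("properties", {}).values():
        total += 1  # the field itself (object or leaf)
        total += count_mapped_fields(field)  # nested sub-fields, if any
    return total


# Hypothetical dynamic object mapping: one sub-object per child node identifier.
object_mapping = {
    "properties": {
        "neos_fulltext_parts": {
            "properties": {
                "child-node-1": {
                    "properties": {"h1": {"type": "text"}, "text": {"type": "text"}}
                },
                "child-node-2": {"properties": {"text": {"type": "text"}}},
            }
        }
    }
}

# With the flattened type the whole structure is a single mapped field.
flattened_mapping = {
    "properties": {"neos_fulltext_parts": {"type": "flattened"}}
}

print(count_mapped_fields(object_mapping))     # 6
print(count_mapped_fields(flattened_mapping))  # 1
```

With the object mapping, every new child node adds at least two fields, so the count grows with the content; with flattened it stays at one no matter how many child nodes are indexed.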
The bad thing is, we can't simply override this in userland code… The default is
@kdambekalns: https://github.com/Flowpack/Flowpack.ElasticSearch.ContentRepositoryAdaptor/commit/97aa56fe2b1f30e16b53c766c2ecda954a5cae08 is what's needed to unset `enabled`. Seems I actually only added that to v8 🙈
As soon as documents are actually added, this happens to an index named like neos-e781f29c8dd927c09735547a848e3459 (note the missing suffix), in which the mapping is like this:
@kdambekalns: There should never be an index named like that. It should be an alias pointing to one of the timestamp-suffixed indices. Apparently this index was created on the fly, without the index configuration / mapping applied, before the aliases were added.
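A quick way to check for this situation is to look at the output of GET /_alias, which lists every concrete index together with its aliases. The following Python sketch classifies a name based on such a response. The function name classify_name is hypothetical, and the sample responses are hand-written to mirror the two states described in this thread, not captured from a real cluster.

```python
def classify_name(name: str, alias_response: dict) -> str:
    """Return "index", "alias" or "missing" for a name, given a GET /_alias response."""
    if name in alias_response:
        return "index"  # top-level keys are concrete indices
    for index_info in alias_response.values():
        if name in index_info.get("aliases", {}):
            return "alias"
    return "missing"


base = "neos-e781f29c8dd927c09735547a848e3459"

# Expected state: the base name is an alias on the timestamp-suffixed index.
healthy = {
    base + "-1612469847": {"aliases": {base: {}}},
}

# Broken state: the base name exists as a concrete index created on the fly.
broken = {
    base + "-1612469847": {"aliases": {}},
    base: {"aliases": {}},
}

print(classify_name(base, healthy))  # alias
print(classify_name(base, broken))   # index
```

If the base name comes back as "index", documents written to it get dynamically mapped, which matches the object mapping seen above.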
Great, that bug about the alias not being an alias is a long-time acquaintance, now that you mention it. With that fixed (manually), indexing indeed preserves the flattened type.
I'll create a PR mimicking https://github.com/Flowpack/Flowpack.ElasticSearch.ContentRepositoryAdaptor/commit/97aa56fe2b1f30e16b53c766c2ecda954a5cae08 for v7 tomorrow.
And look into that "no alias created as expected" issue.
I'll create a PR mimicking 97aa56f for v7 tomorrow.
https://github.com/Flowpack/Flowpack.ElasticSearch.ContentRepositoryAdaptor/pull/372
Anyone got this working with OpenSearch (which does not support flattened)? 🤔