Flowpack.ElasticSearch.ContentRepositoryAdaptor

Use flattened field type instead of object for neos_fulltext_parts

Open daniellienert opened this issue 4 years ago • 7 comments

The field neos_fulltext_parts stores the fulltext path for every aggregateRoot document, with the child node identifiers as keys. This can easily lead to a mapping explosion. Starting with Elasticsearch 7.3, you can configure:

  properties:
    'neos_fulltext_parts':
      search:
        elasticSearchMapping:
          type: flattened
        indexing: ''

to avoid this.

This should be the default when 7.x is the minimal supported version.

Note: as per https://www.elastic.co/guide/en/elasticsearch/reference/7.5/release-highlights-7.3.0.html#_new_flattened_field_type this type is only available "with the default distribution of Elasticsearch." – whatever that means.

daniellienert, Nov 11 '20 07:11

The bad thing is, we can't simply override this in userland code… The default is

    'neos_fulltext_parts':
      search:
        elasticSearchMapping:
          type: object
          enabled: false
        indexing: ''

and when changing the type to flattened, you get this: Mapping definition for [neos_fulltext_parts] has unsupported parameters: [enabled : false]

kdambekalns, Feb 04 '21 14:02

I changed this directly; configuration:show now gives me:

        neos_fulltext_parts:
            search:
                elasticSearchMapping:
                    type: flattened
                indexing: ''

like above. I still get this:

{
    "update": {
        "_index": "neos-fb11fdde869d0a8fcfe00a2fd35c031d",
        "_type": "_doc",
        "_id": "0c079d7c6c2a427f76eaf55340dd28d79ce3ae47",
        "status": 400,
        "error": {
            "type": "illegal_argument_exception",
            "reason": "Limit of total fields [1000] has been exceeded"
        }
    }
}

Debugging shows that during indexing, an index named like neos-e781f29c8dd927c09735547a848e3459-1612469847 is created, which shows this when fetching the mapping:

"neos_fulltext_parts": {
	"type": "flattened"
}

As soon as documents are actually added, this happens to an index named like neos-e781f29c8dd927c09735547a848e3459 (note the missing suffix), in which the mapping is like this:

"neos_fulltext_parts": {
	"properties": {
		"04b90b2e-fc62-4c31-aa7a-12cbb2c8dc94": {
			"properties": { … }
		},
		…
	}
}
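To illustrate why such a dynamically created mapping runs into the "Limit of total fields [1000]" error, here is a minimal Python sketch that counts the entries under `properties` in a mapping. It is a simplification: Elasticsearch also counts multi-fields and field aliases towards the limit, which this ignores. The helper name `count_mapped_fields`, the node keys `node-a` / `node-b` and the sub-fields `text` / `h1` are made up for the example; the real sub-properties are elided above.

```python
from typing import Any, Dict


def count_mapped_fields(mapping: Dict[str, Any]) -> int:
    """Count every entry under "properties", recursing into sub-objects.

    Simplified: multi-fields ("fields") and field aliases are not counted.
    """
    total = 0
    for field_mapping in mapping.get("properties", {}).values():
        total += 1  # the field or object itself
        total += count_mapped_fields(field_mapping)  # 0 for leaf fields
    return total


# Hypothetical dynamic mapping: one object per child node, each with sub-fields.
object_mapping = {
    "properties": {
        "neos_fulltext_parts": {
            "properties": {
                "node-a": {"properties": {"text": {"type": "text"}, "h1": {"type": "text"}}},
                "node-b": {"properties": {"text": {"type": "text"}}},
            }
        }
    }
}

# With the flattened type, the whole structure is a single mapped field.
flattened_mapping = {
    "properties": {"neos_fulltext_parts": {"type": "flattened"}}
}

print(count_mapped_fields(object_mapping))     # 6
print(count_mapped_fields(flattened_mapping))  # 1
```

With the object mapping the count grows with every indexed child node, while the flattened mapping stays at one field regardless of how many node identifiers appear as keys.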

kdambekalns, Feb 04 '21 20:02

> The bad thing is, we can't simply override this in userland code… The default is

@kdambekalns: https://github.com/Flowpack/Flowpack.ElasticSearch.ContentRepositoryAdaptor/commit/97aa56fe2b1f30e16b53c766c2ecda954a5cae08 is what's needed to unset `enabled`. Seems I actually only added that to v8 🙈

daniellienert, Feb 04 '21 21:02

> As soon as documents are actually added, this happens to an index named like neos-e781f29c8dd927c09735547a848e3459 (note the missing suffix), in which the mapping is like this:

@kdambekalns: There should never be an index named like that. It should be an alias pointing to one of the timestamp-suffixed indices. Apparently this index was created on the fly, without the index configuration / mapping applied, before the aliases were added.
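One way to check whether the unsuffixed name really is an alias is Elasticsearch's get alias API (`GET /_alias/<name>`), which answers with the concrete indices behind an alias and with a 404 if no such alias exists. The following Python sketch assumes a cluster reachable without authentication; the helper names `fetch_alias` and `indices_behind_alias` are made up for this example.

```python
import json
import urllib.error
import urllib.request
from typing import Any, Dict, List, Optional


def indices_behind_alias(alias_response: Dict[str, Any], alias: str) -> List[str]:
    """Return the concrete index names that carry the given alias."""
    return sorted(
        index
        for index, body in alias_response.items()
        if alias in body.get("aliases", {})
    )


def fetch_alias(base_url: str, alias: str) -> Optional[Dict[str, Any]]:
    """Call GET /_alias/<alias>; return None if the alias does not exist (404)."""
    try:
        with urllib.request.urlopen(f"{base_url}/_alias/{alias}") as response:
            return json.load(response)
    except urllib.error.HTTPError as error:
        if error.code == 404:
            return None
        raise


# Shape of a healthy response: the suffixed index carries the unsuffixed alias.
sample_response = {
    "neos-e781f29c8dd927c09735547a848e3459-1612469847": {
        "aliases": {"neos-e781f29c8dd927c09735547a848e3459": {}}
    }
}
resolved = indices_behind_alias(sample_response, "neos-e781f29c8dd927c09735547a848e3459")
print(resolved)
```

If `fetch_alias("http://localhost:9200", "neos-e781f29c8dd927c09735547a848e3459")` (host assumed) returns `None` while an index of that name exists, the name is a concrete index rather than an alias, which matches the situation described here.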

daniellienert, Feb 04 '21 22:02

Great, that bug about the alias not being an alias is a long-time acquaintance, now that you mention it. With that fixed (manually), indexing indeed preserves the flattened type.

I'll create a PR mimicking https://github.com/Flowpack/Flowpack.ElasticSearch.ContentRepositoryAdaptor/commit/97aa56fe2b1f30e16b53c766c2ecda954a5cae08 for v7 tomorrow.

And I'll look into that "no alias created as expected" issue.

kdambekalns, Feb 04 '21 22:02

> I'll create a PR mimicking 97aa56f for v7 tomorrow.

https://github.com/Flowpack/Flowpack.ElasticSearch.ContentRepositoryAdaptor/pull/372

kdambekalns, Feb 05 '21 09:02

Has anyone got this working with OpenSearch (which does not support flattened)? 🤔

paavo, Feb 21 '24 13:02