elasticsearch
Add security index metadata -> metadata_flattened field migration
This PR adds code to migrate the `metadata` field to the `metadata_flattened` field in the main security index. `metadata_flattened` is of type `flattened` and already exists to enable queries against `metadata` for API keys. After this PR has been merged, `metadata` will be indexed in the `metadata_flattened` field for `privilege`, `user`, `apikey` (already available), `role-mappings` and `roles`.
To migrate the field, the `PersistentTasksExecutor` is used. It is responsible for executing restartable tasks that can survive the disappearance of coordinating and executor nodes. It also guarantees that only a single instance of a task with the same id runs at a time.
The migration is triggered by a security index state change, so on every such change the cluster state is checked to see whether the migration has already happened. If a new security index is created, the migration is skipped and the cluster state is populated with the migration status set to completed.
The migration status is stored in IndexMetadata in cluster state as custom metadata.
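The decision described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code in this PR: the class, the `Action` enum, the `migration_status` key and the plain `Map` standing in for the custom metadata are all hypothetical names chosen for the example.

```java
import java.util.Map;

public class MigrationDecisionSketch {

    /** Hypothetical outcomes of inspecting cluster state on a security index state change. */
    public enum Action { MARK_COMPLETED, START_MIGRATION_TASK, NOTHING }

    // Hypothetical key and value; the real custom metadata layout may differ.
    public static final String MIGRATION_KEY = "migration_status";
    public static final String COMPLETED = "completed";

    /**
     * @param indexJustCreated true when the security index was newly created
     * @param customMetadata   stand-in for the custom metadata kept on the index metadata
     */
    public static Action decide(boolean indexJustCreated, Map<String, String> customMetadata) {
        if (COMPLETED.equals(customMetadata.get(MIGRATION_KEY))) {
            // Migration already recorded as done: nothing to do on this state change.
            return Action.NOTHING;
        }
        if (indexJustCreated) {
            // A fresh index has no old documents, so only the status needs recording.
            return Action.MARK_COMPLETED;
        }
        // Existing index without a completed status: hand off to the persistent task.
        return Action.START_MIGRATION_TASK;
    }

    public static void main(String[] args) {
        System.out.println(decide(true, Map.of()));
        System.out.println(decide(false, Map.of()));
        System.out.println(decide(false, Map.of(MIGRATION_KEY, COMPLETED)));
    }
}
```

Because the check runs on every security index state change, it needs to be cheap and idempotent, which is why the sketch looks at the recorded status first.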
Once this has been merged, the APIs that write to `privilege`, `user`, `role-mappings` and `roles` will dual-write the `metadata` field to `metadata_flattened`. This means we also have to check that the new metadata flattened node feature exists, so that the new field is not written in a mixed cluster, which would cause old nodes to crash.
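As an illustration of the dual-write guard, here is a minimal sketch. It assumes a boolean that says whether every node in the cluster supports the new field; `DualWriteSketch`, `buildDocument` and the `name` field are hypothetical, and the real write paths build their documents differently.

```java
import java.util.HashMap;
import java.util.Map;

public class DualWriteSketch {

    /**
     * Builds a simplified security document.
     *
     * @param allNodesSupportFlattened stand-in for the node feature check; when false
     *                                 (mixed cluster) only the old field is written
     */
    public static Map<String, Object> buildDocument(
        String name,
        Map<String, Object> metadata,
        boolean allNodesSupportFlattened
    ) {
        Map<String, Object> doc = new HashMap<>();
        doc.put("name", name);
        // The original field is always written.
        doc.put("metadata", metadata);
        if (allNodesSupportFlattened) {
            // Same content, second field: this is the dual-write.
            doc.put("metadata_flattened", metadata);
        }
        return doc;
    }

    public static void main(String[] args) {
        Map<String, Object> metadata = new HashMap<>();
        metadata.put("team", "security");
        System.out.println(buildDocument("my_role", metadata, true).keySet());
        System.out.println(buildDocument("my_role", metadata, false).keySet());
    }
}
```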
The PR also adds:
- A new node feature, `security.metadata_migrated`, to prevent the migration from running in mixed clusters.
- A new transport version, `ADD_METADATA_FLATTENED_TO_ROLES`, to control the serialization of role descriptors.
- A new field in the query user API to allow queries against `metadata` for verification purposes.
Notes
- I investigated whether the status of the migration job (completed or not) could be kept in the `_meta` field in the security index mapping. After some testing, I don't think that is a pattern we want to use, primarily because of how the `_meta` field is updated. It's a full replace of the object field, which means the mappings first have to be read, then updated/merged with the new properties, and then written back. This whole process is not atomic, so it's not a good place to track the status of a job due to the risk of race conditions. It's also not a great place to keep metadata because of how the mappings are currently handled by the `SystemIndexMappingUpdateService`, where the mapping is overwritten (including `_meta`) with the full mapping from the `SystemIndexDescriptor`. If the long-term goal is to move over to only using the `SystemIndexMappingUpdateService`, it's better not to try to hack around its implementation, in my opinion.
- Logic could be added to the query metadata APIs (query user, query roles) to check whether the migration has happened and throw an error if the field is used in a query, to prevent confusion for customers running mixed clusters or where the migration failed for some reason. I have not added this to the query users API.
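To make the race condition in the first note concrete, the sketch below simulates two writers doing a read, merge and full replace on a `_meta`-like object. It is a deterministic toy model using a plain `Map`, not Elasticsearch code; the class name and the property names are made up for the example.

```java
import java.util.HashMap;
import java.util.Map;

public class MetaLostUpdateSketch {

    // Stand-in for the _meta object in the index mapping.
    private Map<String, Object> meta = new HashMap<>();

    /** Returns a copy, like reading the mapping from cluster state. */
    public Map<String, Object> read() {
        return new HashMap<>(meta);
    }

    /** Full replace of the object: nothing from the previous value is kept. */
    public void replace(Map<String, Object> newMeta) {
        meta = new HashMap<>(newMeta);
    }

    /** Interleaves two read-merge-write cycles and returns the final _meta. */
    public static Map<String, Object> simulate() {
        MetaLostUpdateSketch mapping = new MetaLostUpdateSketch();

        // Both writers read before either one writes.
        Map<String, Object> writerA = mapping.read();
        Map<String, Object> writerB = mapping.read();

        // Each merges its own property into its private copy.
        writerA.put("migration_status", "completed");
        writerB.put("other_property", "value");

        // Writes land one after the other; the second replaces the first entirely.
        mapping.replace(writerA);
        mapping.replace(writerB);

        return mapping.read();
    }

    public static void main(String[] args) {
        // The migration status written by writer A is no longer present.
        System.out.println(simulate());
    }
}
```

Storing the status as custom metadata in cluster state avoids this, because the value does not have to be merged into a shared object that other code replaces wholesale.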
Future Work
- When doing the next major upgrade, this approach would allow us to drop the `metadata` field in favour of the `metadata_flattened` field.
- Add a query roles API that uses the new field.
- Remove usage of `metadata` entirely.
TODO
- Add BWC test that checks the security index directly for models without query api
- Add unit test for new executor class
- Update user docs to include metadata
- Add and describe the info logs added to track the status of this job.
- Mark `metadata` for removal in 9.x