cross-cluster-replication
[BUG] Lots of 'Metadata for system indices doesn't exist' errors in logs
What is the bug?
System indices are skipped during replication, but OpenSearch logs contain many corresponding errors (Metadata for .opendistro_security doesn't exist).
How can one reproduce the bug? Steps to reproduce the behavior:
- Run autofollow for all indices:
curl -XPOST -k -H 'Content-Type: application/json' 'https://localhost:9200/_plugins/_replication/_autofollow' -d '
{
"leader_alias": "leader-cluster",
"pattern": "*",
"name": "replication",
"use_roles": {
"leader_cluster_role": "all_access",
"follower_cluster_role": "all_access"
}
}'
- Check OpenSearch logs:
[2023-05-19T14:40:54,107][ERROR][o.o.r.m.ReplicationMetadataManager] [opensearch-0] Encountered exception -
org.opensearch.ResourceNotFoundException: Metadata for .opendistro_security doesn't exist
at org.opensearch.replication.metadata.store.ReplicationMetadataStore.getMetadata(ReplicationMetadataStore.kt:146) ~[opensearch-cross-cluster-replication-2.4.1.0.jar:2.4.1.0]
at org.opensearch.replication.metadata.store.ReplicationMetadataStore$getMetadata$1.invokeSuspend(ReplicationMetadataStore.kt) ~[opensearch-cross-cluster-replication-2.4.1.0.jar:2.4.1.0]
at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33) [kotlin-stdlib-1.6.0.jar:1.6.0-release-798(1.6.0)]
at kotlinx.coroutines.UndispatchedCoroutine.afterResume(CoroutineContext.kt:147) [kotlinx-coroutines-core-jvm-1.6.0.jar:?]
at kotlinx.coroutines.AbstractCoroutine.resumeWith(AbstractCoroutine.kt:102) [kotlinx-coroutines-core-jvm-1.6.0.jar:?]
at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:46) [kotlin-stdlib-1.6.0.jar:1.6.0-release-798(1.6.0)]
at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:106) [kotlinx-coroutines-core-jvm-1.6.0.jar:?]
at kotlinx.coroutines.scheduling.CoroutineScheduler.runSafely(CoroutineScheduler.kt:571) [kotlinx-coroutines-core-jvm-1.6.0.jar:?]
at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.executeTask(CoroutineScheduler.kt:750) [kotlinx-coroutines-core-jvm-1.6.0.jar:?]
at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.runWorker(CoroutineScheduler.kt:678) [kotlinx-coroutines-core-jvm-1.6.0.jar:?]
at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.run(CoroutineScheduler.kt:665) [kotlinx-coroutines-core-jvm-1.6.0.jar:?]
[2023-05-19T14:40:54,107][ERROR][o.o.r.a.s.TransportReplicationStatusAction] [opensearch-0] got ResourceNotFoundException while querying for status
org.opensearch.ResourceNotFoundException: Metadata for .opendistro_security doesn't exist
at org.opensearch.replication.metadata.store.ReplicationMetadataStore.getMetadata(ReplicationMetadataStore.kt:146) ~[opensearch-cross-cluster-replication-2.4.1.0.jar:2.4.1.0]
at org.opensearch.replication.metadata.store.ReplicationMetadataStore$getMetadata$1.invokeSuspend(ReplicationMetadataStore.kt) ~[opensearch-cross-cluster-replication-2.4.1.0.jar:2.4.1.0]
at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33) [kotlin-stdlib-1.6.0.jar:1.6.0-release-798(1.6.0)]
at kotlinx.coroutines.UndispatchedCoroutine.afterResume(CoroutineContext.kt:147) [kotlinx-coroutines-core-jvm-1.6.0.jar:?]
at kotlinx.coroutines.AbstractCoroutine.resumeWith(AbstractCoroutine.kt:102) [kotlinx-coroutines-core-jvm-1.6.0.jar:?]
at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:46) [kotlin-stdlib-1.6.0.jar:1.6.0-release-798(1.6.0)]
at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:106) [kotlinx-coroutines-core-jvm-1.6.0.jar:?]
at kotlinx.coroutines.scheduling.CoroutineScheduler.runSafely(CoroutineScheduler.kt:571) [kotlinx-coroutines-core-jvm-1.6.0.jar:?]
at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.executeTask(CoroutineScheduler.kt:750) [kotlinx-coroutines-core-jvm-1.6.0.jar:?]
at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.runWorker(CoroutineScheduler.kt:678) [kotlinx-coroutines-core-jvm-1.6.0.jar:?]
at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.run(CoroutineScheduler.kt:665) [kotlinx-coroutines-core-jvm-1.6.0.jar:?]
What is the expected behavior?
No ReplicationMetadataStore error should be logged for indices that are not replicated.
What is your host/environment?
- OpenSearch Version - 2.4.1
This error is printed when the replication status API is invoked. Are you running a job that queries the replication status for all indices? If not, can you describe the simulation setup: have you created the leader and follower clusters with or without security?
Hi!
We have queries that retrieve the replication status, but not for all indices. We exclude indices whose names start with . from all queries.
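For illustration, the exclusion described above could look roughly like the following. This is only a sketch, not our actual job: FOLLOWER_URL and the function names is_dot_index, filter_user_indices and query_status are made up for the example, and authentication is left out as in the curl command from the report.

```shell
#!/usr/bin/env bash
# Sketch only: skip dot-prefixed indices before asking for replication status.
# FOLLOWER_URL and all function names are assumptions for illustration.
FOLLOWER_URL="${FOLLOWER_URL:-https://localhost:9200}"

# Succeeds (exit 0) when the index name starts with a dot.
is_dot_index() {
  case "$1" in
    .*) return 0 ;;
    *)  return 1 ;;
  esac
}

# Reads index names on stdin, one per line, and prints only the non-dot ones.
filter_user_indices() {
  while IFS= read -r index; do
    [ -n "$index" ] || continue
    is_dot_index "$index" || printf '%s\n' "$index"
  done
}

# Calls the replication status API for each non-dot index read from stdin.
query_status() {
  filter_user_indices | while IFS= read -r index; do
    curl -sk "${FOLLOWER_URL}/_plugins/_replication/${index}/_status?pretty"
  done
}
```

It could be fed from the index list, for example: curl -sk "$FOLLOWER_URL/_cat/indices?h=index" | query_status. With a filter like this, .opendistro_security would never be passed to the status API by our own queries.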
We configure the leader cluster before running autofollow:
PUT /_cluster/settings
{
"persistent": {
"cluster": {
"remote": {
"leader-cluster": {
"seeds": ["url"]
}
}
}
}
}
Then we start autofollow with the following settings:
POST /_plugins/_replication/_autofollow
{
"leader_alias": "leader-cluster",
"pattern": "*",
"name": "replication",
"use_roles": {
"leader_cluster_role": "all_access",
"follower_cluster_role": "all_access"
}
}