[BUG] 2.10 Failing integration tests for GeoSpatial Plugin due no index found: security-auditlog-2023.09.11
Hi, While running the integration tests for Geospatial plugin for the 2.10 release, the integration tests are logging the below error which is resulting in failures of integration tests. This is blocking the release of OpenSearch 2.10 version.
[2023-09-11T04:08:24,874][ERROR][o.o.s.a.s.InternalOpenSearchSink] [node_name_9200] Unable to index audit log {"audit_cluster_name":"opensearchcluster1","audit_transport_headers":{"_system_index_access_allowed":"false"},"audit_node_name":"node_name_9200","audit_trace_task_id":"aeQIHqurTY6cwJp-h8rLqQ:3984","audit_transport_request_type":"Request","audit_category":"INDEX_EVENT","audit_request_origin":"REST","audit_node_id":"aeQIHqurTY6cwJp-h8rLqQ","audit_request_layer":"TRANSPORT","@timestamp":"2023-09-11T04:08:24.472+00:00","audit_format_version":4,"audit_request_remote_address":"127.0.0.1","audit_request_privilege":"indices:admin/data_stream/delete","audit_node_host_address":"127.0.0.1","audit_request_effective_user":"admin","audit_trace_indices":["*"],"audit_trace_resolved_indices":[".opendistro_security",".opensearch-observability",".plugins-ml-config",".opendistro-job-scheduler-lock"],"audit_node_host_name":"127.0.0.1"} due to
org.opensearch.index.IndexNotFoundException: no such index [security-auditlog-2023.09.11]
at org.opensearch.cluster.metadata.IndexNameExpressionResolver.concreteIndices(IndexNameExpressionResolver.java:287) ~[opensearch-2.10.0.jar:2.10.0]
at org.opensearch.cluster.metadata.IndexNameExpressionResolver.concreteWriteIndex(IndexNameExpressionResolver.java:482) ~[opensearch-2.10.0.jar:2.10.0]
at org.opensearch.action.bulk.TransportBulkAction$ConcreteIndices.resolveIfAbsent(TransportBulkAction.java:801) ~[opensearch-2.10.0.jar:2.10.0]
at org.opensearch.action.bulk.TransportBulkAction$BulkOperation.addFailureIfIndexIsUnavailable(TransportBulkAction.java:743) ~[opensearch-2.10.0.jar:2.10.0]
at org.opensearch.action.bulk.TransportBulkAction$BulkOperation.doRun(TransportBulkAction.java:532) ~[opensearch-2.10.0.jar:2.10.0]
at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:52) ~[opensearch-2.10.0.jar:2.10.0]
at org.opensearch.action.bulk.TransportBulkAction.executeBulk(TransportBulkAction.java:774) ~[opensearch-2.10.0.jar:2.10.0]
at org.opensearch.action.bulk.TransportBulkAction$1$1.doRun(TransportBulkAction.java:321) ~[opensearch-2.10.0.jar:2.10.0]
at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:908) ~[opensearch-2.10.0.jar:2.10.0]
at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:52) ~[opensearch-2.10.0.jar:2.10.0]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
at java.lang.Thread.run(Thread.java:833) [?:?]
https://ci.opensearch.org/ci/dbc/integ-test/2.10.0/8444/linux/x64/tar/test-results/5752/integ-test/geospatial/with-security/local-cluster-logs/opensearch-service-logs/opensearchcluster1.log
cc: @vamshin , @CEHENKLE
Suspicious line from the logs:
[2023-09-12T04:06:33,123][INFO ][o.o.c.m.MetadataDeleteIndexService] [node_name_9200] [security-auditlog-2023.09.12/i5WCBYiSRP-rWdZp2NdjhQ] deleting index
Edit: Weird, it is trying to index into the audit log the action of it deleting the audit log index:
Unable to index audit log
{
"audit_cluster_name": "opensearchcluster1",
"audit_transport_headers":
{
"_system_index_access_allowed": "false"
},
"audit_node_name": "node_name_9200",
"audit_trace_task_id": "BYVPebDwSTKcagLFfxOZog:1491",
"audit_transport_request_type": "DeleteIndexRequest",
"audit_category": "INDEX_EVENT",
"audit_request_origin": "REST",
"audit_node_id": "BYVPebDwSTKcagLFfxOZog",
"audit_request_layer": "TRANSPORT",
"@timestamp": "2023-09-12T04:06:36.444+00:00",
"audit_format_version": 4,
"audit_request_remote_address": "127.0.0.1",
"audit_request_privilege": "indices:admin/delete",
"audit_node_host_address": "127.0.0.1",
"audit_request_effective_user": "admin",
"audit_trace_indices":
[
"security-auditlog-2023.09.12"
],
"audit_trace_resolved_indices":
[
"security-auditlog-2023.09.12"
],
"audit_node_host_name": "127.0.0.1"
}
The mapping is updated which deletes the index for the old auditlog.
@scrawfor99 is this something plugin tests are doing or being done by any async process.
This is not security functionality. We have made no changes to the audit logging features in this release and all other integration tests pass smoothly. I suspect that this is something with Geospatial but I don't know what those tests do. We can see here:
That there is an attempt to log the deletion of the audit index. This will cause problems since a few lines above we have:
INFO ][o.o.c.m.MetadataDeleteIndexService] [node_name_9200] [security-auditlog-2023.09.12/X-by9snOQHCJX27Qw3DBzQ] deleting index
Which deletes the index unless I am mistaken.
Could this try to be deleting the auditlog? https://github.com/opensearch-project/geospatial/blob/main/src/test/java/org/opensearch/geospatial/OpenSearchSecureRestTestCase.java#L178
Have geospatial integ tests run with security before?
@cwperks yes they were running with security. So it could be some recent change that happened in the Security plugin
As part of unblocking the 2.10 release I did a small hack to remove the index from deletion: https://github.com/opensearch-project/geospatial/pull/430
Two followup action items:
- Enabling the secure cluster integ tests in GeoSpatial
- Delete the index after the tests is run and don't do a blanket delete of the indices.
@vamshin we will need some prioritization for this.