geospatial icon indicating copy to clipboard operation
geospatial copied to clipboard

[BUG] 2.10 Failing integration tests for GeoSpatial Plugin due no index found: security-auditlog-2023.09.11

Open navneet1v opened this issue 2 years ago • 8 comments

Hi, While running the integration tests for Geospatial plugin for the 2.10 release, the integration tests are logging the below error which is resulting in failures of integration tests. This is blocking the release of OpenSearch 2.10 version.

[2023-09-11T04:08:24,874][ERROR][o.o.s.a.s.InternalOpenSearchSink] [node_name_9200] Unable to index audit log {"audit_cluster_name":"opensearchcluster1","audit_transport_headers":{"_system_index_access_allowed":"false"},"audit_node_name":"node_name_9200","audit_trace_task_id":"aeQIHqurTY6cwJp-h8rLqQ:3984","audit_transport_request_type":"Request","audit_category":"INDEX_EVENT","audit_request_origin":"REST","audit_node_id":"aeQIHqurTY6cwJp-h8rLqQ","audit_request_layer":"TRANSPORT","@timestamp":"2023-09-11T04:08:24.472+00:00","audit_format_version":4,"audit_request_remote_address":"127.0.0.1","audit_request_privilege":"indices:admin/data_stream/delete","audit_node_host_address":"127.0.0.1","audit_request_effective_user":"admin","audit_trace_indices":["*"],"audit_trace_resolved_indices":[".opendistro_security",".opensearch-observability",".plugins-ml-config",".opendistro-job-scheduler-lock"],"audit_node_host_name":"127.0.0.1"} due to
org.opensearch.index.IndexNotFoundException: no such index [security-auditlog-2023.09.11]
	at org.opensearch.cluster.metadata.IndexNameExpressionResolver.concreteIndices(IndexNameExpressionResolver.java:287) ~[opensearch-2.10.0.jar:2.10.0]
	at org.opensearch.cluster.metadata.IndexNameExpressionResolver.concreteWriteIndex(IndexNameExpressionResolver.java:482) ~[opensearch-2.10.0.jar:2.10.0]
	at org.opensearch.action.bulk.TransportBulkAction$ConcreteIndices.resolveIfAbsent(TransportBulkAction.java:801) ~[opensearch-2.10.0.jar:2.10.0]
	at org.opensearch.action.bulk.TransportBulkAction$BulkOperation.addFailureIfIndexIsUnavailable(TransportBulkAction.java:743) ~[opensearch-2.10.0.jar:2.10.0]
	at org.opensearch.action.bulk.TransportBulkAction$BulkOperation.doRun(TransportBulkAction.java:532) ~[opensearch-2.10.0.jar:2.10.0]
	at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:52) ~[opensearch-2.10.0.jar:2.10.0]
	at org.opensearch.action.bulk.TransportBulkAction.executeBulk(TransportBulkAction.java:774) ~[opensearch-2.10.0.jar:2.10.0]
	at org.opensearch.action.bulk.TransportBulkAction$1$1.doRun(TransportBulkAction.java:321) ~[opensearch-2.10.0.jar:2.10.0]
	at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:908) ~[opensearch-2.10.0.jar:2.10.0]
	at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:52) ~[opensearch-2.10.0.jar:2.10.0]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) [?:?]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) [?:?]
	at java.lang.Thread.run(Thread.java:833) [?:?]

https://ci.opensearch.org/ci/dbc/integ-test/2.10.0/8444/linux/x64/tar/test-results/5752/integ-test/geospatial/with-security/local-cluster-logs/opensearch-service-logs/opensearchcluster1.log

navneet1v avatar Sep 12 '23 16:09 navneet1v

cc: @vamshin , @CEHENKLE

navneet1v avatar Sep 12 '23 17:09 navneet1v

Suspicious line from the logs:

[2023-09-12T04:06:33,123][INFO ][o.o.c.m.MetadataDeleteIndexService] [node_name_9200] [security-auditlog-2023.09.12/i5WCBYiSRP-rWdZp2NdjhQ] deleting index

Edit: Weird, it is trying to index into the audit log the action of it deleting the audit log index:

Unable to index audit log 
{
    "audit_cluster_name": "opensearchcluster1",
    "audit_transport_headers":
    {
        "_system_index_access_allowed": "false"
    },
    "audit_node_name": "node_name_9200",
    "audit_trace_task_id": "BYVPebDwSTKcagLFfxOZog:1491",
    "audit_transport_request_type": "DeleteIndexRequest",
    "audit_category": "INDEX_EVENT",
    "audit_request_origin": "REST",
    "audit_node_id": "BYVPebDwSTKcagLFfxOZog",
    "audit_request_layer": "TRANSPORT",
    "@timestamp": "2023-09-12T04:06:36.444+00:00",
    "audit_format_version": 4,
    "audit_request_remote_address": "127.0.0.1",
    "audit_request_privilege": "indices:admin/delete",
    "audit_node_host_address": "127.0.0.1",
    "audit_request_effective_user": "admin",
    "audit_trace_indices":
    [
        "security-auditlog-2023.09.12"
    ],
    "audit_trace_resolved_indices":
    [
        "security-auditlog-2023.09.12"
    ],
    "audit_node_host_name": "127.0.0.1"
}

cwperks avatar Sep 12 '23 17:09 cwperks

Screenshot 2023-09-12 at 2 00 52 PM

The mapping is updated which deletes the index for the old auditlog.

stephen-crawford avatar Sep 12 '23 18:09 stephen-crawford

@scrawfor99 is this something plugin tests are doing or being done by any async process.

navneet1v avatar Sep 12 '23 18:09 navneet1v

This is not security functionality. We have made no changes to the audit logging features in this release and all other integration tests pass smoothly. I suspect that this is something with Geospatial but I don't know what those tests do. We can see here: Screenshot 2023-09-12 at 2 11 48 PM

That there is an attempt to log the deletion of the audit index. This will cause problems since a few lines above we have:

INFO ][o.o.c.m.MetadataDeleteIndexService] [node_name_9200] [security-auditlog-2023.09.12/X-by9snOQHCJX27Qw3DBzQ] deleting index

Which deletes the index unless I am mistaken.

stephen-crawford avatar Sep 12 '23 18:09 stephen-crawford

Could this try to be deleting the auditlog? https://github.com/opensearch-project/geospatial/blob/main/src/test/java/org/opensearch/geospatial/OpenSearchSecureRestTestCase.java#L178

Have geospatial integ tests run with security before?

cwperks avatar Sep 12 '23 18:09 cwperks

@cwperks yes they were running with security. So it could be some recent change that happened in the Security plugin

navneet1v avatar Sep 12 '23 18:09 navneet1v

As part of unblocking the 2.10 release I did a small hack to remove the index from deletion: https://github.com/opensearch-project/geospatial/pull/430

Two followup action items:

  1. Enabling the secure cluster integ tests in GeoSpatial
  2. Delete the index after the tests is run and don't do a blanket delete of the indices.

@vamshin we will need some prioritization for this.

navneet1v avatar Sep 12 '23 19:09 navneet1v