Support for searchable snapshots in datanode

Open todvora opened this issue 4 months ago • 3 comments

/jpd Graylog2/graylog-plugin-enterprise#6642

Description

This PR adds the possibility to install the repository-s3 plugin, configure the necessary opensearch properties and add the access and secret keys.

Searchable snapshots need a registered snapshot repository. For details about registering one, see https://opensearch.org/docs/latest/api-reference/snapshots/create-repository/

Motivation and Context

Graylog supports searchable snapshots for archiving, so datanode has to offer the same functionality.

Technical details

Plugin download

The plugin itself is part of the datanode distribution. It is downloaded during the build, via maven, from https://artifacts.opensearch.org/releases/plugins/repository-s3/${opensearch.version}/repository-s3-${opensearch.version}.zip
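As a small illustration of how the download URL expands for a given opensearch version, here is a sketch in Python. The URL pattern is the one quoted above; the function name and the version used in the usage note are hypothetical, and the actual download is done by the maven build, not by code like this.

```python
# URL pattern copied from the PR description; {version} stands in for
# the ${opensearch.version} maven property.
PLUGIN_URL_TEMPLATE = (
    "https://artifacts.opensearch.org/releases/plugins/repository-s3/"
    "{version}/repository-s3-{version}.zip"
)


def repository_s3_url(opensearch_version: str) -> str:
    """Return the download URL of the repository-s3 plugin ZIP (hypothetical helper)."""
    return PLUGIN_URL_TEMPLATE.format(version=opensearch_version)
```

For example, `repository_s3_url("2.10.0")` yields a URL ending in `/2.10.0/repository-s3-2.10.0.zip`; the version number here is only an example.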

Plugin installation

The plugin is distributed but not preinstalled. During datanode startup, if we detect s3 repository configuration values, we install the plugin automatically from its ZIP file, using the opensearch-plugin CLI.

We don't preinstall the plugin because it's not needed everywhere and we'd have to handle its installation for two different architectures, x64 and aarch64. This approach adds flexibility, removes build complexity and allows dynamic plugin handling in the future.
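The installation step can be pictured roughly as follows. This is a sketch in Python, not the datanode code: the helper name and the paths in the usage note are assumptions. It relies only on the `install` subcommand of opensearch-plugin, its `--batch` flag (which skips the interactive confirmation) and its support for `file://` URLs.

```python
from typing import List


def plugin_install_command(opensearch_home: str, plugin_zip: str) -> List[str]:
    """Build the argv for installing a plugin from a local ZIP (hypothetical helper).

    plugin_zip is expected to be an absolute path, so that the result
    is a well-formed file:///... URL.
    """
    cli = opensearch_home.rstrip("/") + "/bin/opensearch-plugin"
    # --batch answers the permission prompt automatically, which is
    # needed when the CLI runs unattended during startup.
    return [cli, "install", "--batch", "file://" + plugin_zip]
```

The resulting list could be passed to `subprocess.run(cmd, check=True)`. For instance, `plugin_install_command("/opt/opensearch", "/opt/datanode/dist/plugins/repository-s3-2.10.0.zip")` builds a command ending in `file:///opt/datanode/dist/plugins/repository-s3-2.10.0.zip`; both paths are made up for the example.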

Keystore secrets

If the s3 repository configuration is detected, we'll create an opensearch keystore, using the opensearch-keystore CLI. Then we'll add s3.client.default.access_key and s3.client.default.secret_key values to the keystore.
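A rough sketch of the keystore steps, again in Python and only as an illustration. The helper name is hypothetical; the setting names come from the paragraph above, and the `create` and `add --stdin` subcommands are the standard opensearch-keystore ones. Passing the secret over stdin keeps it out of the process argument list.

```python
from typing import List, Optional, Tuple

Command = Tuple[List[str], Optional[str]]  # (argv, text to send on stdin)


def keystore_commands(opensearch_home: str, access_key: str, secret_key: str) -> List[Command]:
    """Return the keystore CLI calls in the order they would run (hypothetical helper)."""
    cli = opensearch_home.rstrip("/") + "/bin/opensearch-keystore"
    return [
        # Create an empty keystore first.
        ([cli, "create"], None),
        # Each secret is read from stdin instead of being passed as an argument.
        ([cli, "add", "--stdin", "s3.client.default.access_key"], access_key),
        ([cli, "add", "--stdin", "s3.client.default.secret_key"], secret_key),
    ]
```

Each pair could be executed with `subprocess.run(argv, input=stdin_text, text=True, check=True)`.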

Configuration options

All configuration properties related to the s3 repository plugin are managed by a new configuration class called S3RepositoryConfiguration. These properties are now present:

  • s3_client_default_access_key, optional, no default value
  • s3_client_default_secret_key, optional, no default value
  • s3_client_default_endpoint, optional, no default value
  • s3_client_default_protocol, required, http by default
  • s3_client_default_region, required, us-east-2 by default
  • s3_client_default_path_style_access, required, true by default

As with all other config properties, you can provide them to datanode as environment variables by prefixing the upper-cased property name with GRAYLOG_DATANODE_, e.g. GRAYLOG_DATANODE_S3_CLIENT_DEFAULT_ACCESS_KEY.
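The naming rule for environment variables is mechanical, as this small Python sketch shows. The function name is hypothetical; the prefix and the example come from the sentence above.

```python
ENV_PREFIX = "GRAYLOG_DATANODE_"


def env_var_name(property_name: str) -> str:
    """Map a datanode config property to its environment variable name."""
    return ENV_PREFIX + property_name.upper()
```

So `env_var_name("s3_client_default_access_key")` gives `GRAYLOG_DATANODE_S3_CLIENT_DEFAULT_ACCESS_KEY`.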

New datanode config properties

  • opensearch_plugins_location, the directory where the plugin ZIP files are located. Points to dist/plugins by default.

  • node_search_cache_size, the cache size for searchable snapshots. Currently set to 10gb.

  • Additionally, we need to explicitly configure the opensearch node roles, as the search role has to be present. These roles are now configured for each opensearch node:

      • cluster_manager
      • data
      • ingest
      • remote_cluster_client
      • search
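To make the resulting opensearch settings concrete, here is a sketch in Python of what might end up in the node configuration. `node.roles` and `node.search.cache.size` are real opensearch setting names, but that datanode maps node_search_cache_size to `node.search.cache.size` is an assumption based on the name, and the helper itself is hypothetical.

```python
from typing import Dict, List, Union

# Roles listed in the PR description; "search" is the one that
# searchable snapshots require.
NODE_ROLES: List[str] = [
    "cluster_manager",
    "data",
    "ingest",
    "remote_cluster_client",
    "search",
]


def opensearch_node_settings(search_cache_size: str = "10gb") -> Dict[str, Union[str, List[str]]]:
    """Return node settings for searchable snapshots (hypothetical helper).

    Assumption: node_search_cache_size is passed through to the
    opensearch setting node.search.cache.size.
    """
    return {
        "node.roles": list(NODE_ROLES),
        "node.search.cache.size": search_cache_size,
    }
```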

When the plugin is enabled

The plugin, its installation, configuration and all related opensearch properties are active when the user provides the following configuration options:

  • s3_client_default_access_key
  • s3_client_default_secret_key
  • s3_client_default_endpoint

If all of them are missing, datanode startup continues as usual, without the plugin. If the user configures only one or two of them, we throw an exception, because we don't want to support a partial configuration that is neither correct nor wrong.
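The all-or-nothing rule can be sketched like this in Python. It is an illustration of the described behaviour, not the code in S3RepositoryConfiguration; the function name, the exception type and the treatment of empty strings as "missing" are assumptions.

```python
from typing import Mapping, Optional

REQUIRED_TOGETHER = (
    "s3_client_default_access_key",
    "s3_client_default_secret_key",
    "s3_client_default_endpoint",
)


def s3_repository_enabled(config: Mapping[str, Optional[str]]) -> bool:
    """Return True if the plugin should be set up, False if it should be skipped.

    Raises ValueError for a partial configuration (one or two of the
    three options present). Empty values count as missing (assumption).
    """
    missing = [key for key in REQUIRED_TOGETHER if not config.get(key)]
    if not missing:
        return True  # all three provided: install and configure the plugin
    if len(missing) == len(REQUIRED_TOGETHER):
        return False  # none provided: start without the plugin
    raise ValueError(
        "Incomplete s3 repository configuration, missing: " + ", ".join(missing)
    )
```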

How Has This Been Tested?

Added an integration test. The test starts a minio container and creates a bucket in it. Then datanode is started and configured with the minio credentials. Once the underlying opensearch is running, we tell it to configure an s3 repository and create a snapshot.
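The two opensearch calls at the end of the test correspond roughly to the requests sketched below in Python. The `_snapshot` endpoints and the `type`/`settings.bucket` body are the standard opensearch snapshot API; the helper and the repository, bucket and snapshot names are hypothetical and not taken from the actual test.

```python
from typing import Any, Dict, List, Optional, Tuple

Request = Tuple[str, str, Optional[Dict[str, Any]]]  # (method, path, JSON body)


def snapshot_requests(repository: str, bucket: str, snapshot: str) -> List[Request]:
    """Describe the REST calls that register an s3 repository and take a snapshot."""
    return [
        # Register a snapshot repository backed by the given bucket.
        ("PUT", "/_snapshot/" + repository, {"type": "s3", "settings": {"bucket": bucket}}),
        # Create a snapshot inside that repository.
        ("PUT", "/_snapshot/" + repository + "/" + snapshot, None),
    ]
```

The endpoint, credentials and path-style access do not appear in these requests, because they come from the s3 client settings and the keystore configured at startup.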

Types of changes

  • [ ] Bug fix (non-breaking change which fixes an issue)
  • [x] New feature (non-breaking change which adds functionality)
  • [ ] Refactoring (non-breaking change)
  • [ ] Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • [x] My code follows the code style of this project.
  • [ ] My change requires a change to the documentation.
  • [ ] I have updated the documentation accordingly.
  • [x] I have read the CONTRIBUTING document.
  • [x] I have added tests to cover my changes.

todvora, Feb 22 '24 10:02