graylog2-server icon indicating copy to clipboard operation
graylog2-server copied to clipboard

Add global Index Sets Defaults system configuration

Open danotorrey opened this issue 1 year ago • 8 comments

Description

Adds a new Index Sets Configuration section to the System > Configurations page where defaults for the number of shards and replicas can be configured.

The current intention is to use these defaults for both standard index set creation, and for Illuminate index sets.

Closes https://github.com/Graylog2/graylog-plugin-enterprise/issues/3264

Motivation and Context

  • Allow configurable Illuminate defaults for shard and replica index set counts (VIA the UI and API). See https://github.com/Graylog2/graylog-plugin-enterprise/issues/3319
  • This new in-DB configuration allows values can be effective at runtime, instead of requiring a server reboot. This is important for Cloud, since rebooting is often not an option.
  • Adjust default values. See discussion and new default values here https://github.com/Graylog2/graylog-plugin-enterprise/issues/3264#issuecomment-1190402468.
  • Remove the elasticsearch_shards and elasticsearch_replicas server.conf properties, and redirect usages to the global config. https://github.com/Graylog2/graylog2-server/blob/7eb551e394ad177974dcc75e813f2de26460372c/graylog2-server/src/main/java/org/graylog2/configuration/ElasticsearchConfiguration.java#L45-L91.

Open Questions

  • [ ] Do the existing index set migrations have any impact on the plan for this issue? @danotorrey to review.

How Has This Been Tested?

TBD

Screenshots (if appropriate):

A new Index Sets Configuration section has been added: image

image

Open Tasks

  • [ ] Review existing index-set migrations to check if any adjustments are needed. Review them soon, so we can be aware of any potential issues.
  • [ ] Assign correct defaults.
  • [ ] Determine if init migration is still needed.
  • [ ] Use defaults in appropriate locations in code (for standard index set creation and Illuminate).
  • [ ] Create an issue for the ultimate removal of the server.conf values in Graylog 5.0.
  • [ ] Add support for index rotation and retention defaults (since those are pluggable, there is a bit more work involved).
  • [ ] Add unit tests.
  • [ ] Remove deprecated Java properties that are no longer used.

Types of changes

  • [ ] Bug fix (non-breaking change which fixes an issue)
  • [x] New feature (non-breaking change which adds functionality)
  • [ ] Refactoring (non-breaking change)
  • [ ] Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • [ ] My code follows the code style of this project.
  • [ ] My change requires a change to the documentation.
  • [ ] I have updated the documentation accordingly.
  • [ ] I have read the CONTRIBUTING document.
  • [ ] I have added tests to cover my changes.

danotorrey avatar Jul 07 '22 22:07 danotorrey

@danotorrey Thanks for working on this!

Identify the appropriate default values for both shards and replicas current defaults are 4 and 0 respectively

For me it would be fine to keep the current defaults for now, to not change existing behaviour. But of course it would be good to get more opinions.

boosty avatar Jul 08 '22 12:07 boosty

Thanks @boosty. I'll finish this up with current defaults, and we can continue accepting more feedback while the review is in progress.

danotorrey avatar Jul 08 '22 12:07 danotorrey

It turns out that the elasticsearch_shards and elasticsearch_replicas server settings are not actually used in the standard new index creation flow, and are only used in some older migrations.

The defaults for those are hard-coded in the frontend: https://github.com/Graylog2/graylog2-server/blob/85eb37534ebe8f77a3f78b80f340ed899b57ab4e/graylog2-web-interface/src/pages/IndexSetCreationPage.tsx#L126-L127

This probably makes it even easier to remove these config values, since not much is relying on them.

I'll work on migrating those hard-coded defaults to use the new in-database configuration.

danotorrey avatar Jul 08 '22 21:07 danotorrey

Looping in @dennisoelkers for early feedback in case you see anything noteworthy so far.

danotorrey avatar Jul 08 '22 21:07 danotorrey

Can you please check the existing index-set-related migrations if there is anything we need to adjust for this? Thank you!

@bernd Thanks for mentioning those. I saw them, but did look in detail at them yet. I will be sure to review as part of this change.

I think we should also add the other hardcoded index set settings to the default configuration while we are at it.

Thanks for the suggestion. Makes sense to me too. Will do.

danotorrey avatar Jul 11 '22 13:07 danotorrey

Just wanted to mention that once merged, we should decide if we are also changing the defaults as discussed here: https://github.com/Graylog2/graylog-plugin-enterprise/issues/3264

Could make sense to announce and document this together.

boosty avatar Jul 20 '22 15:07 boosty

Just wanted to mention that once merged, we should decide if we are also changing the defaults as discussed here: Graylog2/graylog-plugin-enterprise#3264

Could make sense to announce and document this together.

@boosty I agree. I think it makes sense to change the defaults in this PR too (and document/announce together), since a consensus appears to have been reached in https://github.com/Graylog2/graylog-plugin-enterprise/issues/3264.

danotorrey avatar Aug 03 '22 21:08 danotorrey

I am still making progress on this. I now have it working end-to-end. The Configurations page has an Index Defaults section, and the defaults are now used on the Create Index Set page.

image

These are the steps remaining. I will keep working on this, and hope to have this PR ready for review soon.

  • Add support for index rotation and retention defaults (since those are pluggable, there is a bit more work involved).
  • Add unit tests.
  • Remove deprecated Java properties that are no longer used. That is, as long as they are really no longer used. Otherwise, wait for 5.0.

danotorrey avatar Aug 24 '22 20:08 danotorrey

The frontend is now working end-to-end for all property types (including those that are pluggable [retention and rotation strategies]).

Now, I need to implement fall-back defaults support for the configuration properties (as mentioned here). I need to look at the best way to do that. I'll also need to coordinate any planned changes to the config props with @todvora based on any in-progress work.

I'll also need to do some frontend cleanup and add tests.

Definitely nearing completion of this task now. I am working to finish this up early next week. We will also need some time to complete the reviews.

danotorrey avatar Oct 21 '22 22:10 danotorrey

Hey @danotorrey, I have reverted all my changes, as they would cause only problems. Feel free to continue with your work. When you are done, I'll start with the prefix renaming, so we don't have git collisions everywhere.Thanks!

todvora avatar Oct 24 '22 06:10 todvora

@boosty @rich-graylog @waab76 An update on this feature with respect to Graylog 5.0:

Development of the core functionality (frontend/backend) is now done, which provides the ability to specify database-defaults at runtime VIA API and UI and allow those to be used when creating new index sets in the frontend and through Illuminate.

Now, bernd, @todvora, and I are reviewing the legacy server.conf configuration options, and assessing how best to handle/incorporate those with this change set. We are working to complete the review on those tomorrow.

Once we have finished the above-mentioned review, we will need to discuss this with Product to determine the plan for including this in Graylog 5.0. The first beta will be built tomorrow, and this change will not be done/reviewed/merged in-time for that. So, we will need to discuss the risk/benefit tradeoffs and decide if we ship the complete feature in 5.0, or defer certain parts to 5.1. I will summarize here for feedback, or set up a call to discuss within the next days.

danotorrey avatar Oct 25 '22 14:10 danotorrey

@Graylog2/secdev After several iterations, and some requirements adjustments, this is nearly complete and ready for some feedback.

There are a few small tasks left (described above), which I will finish shortly. I'll also get the builds passing. But feedback would be appreciated if anyone has time to take a look/test it out.

danotorrey avatar Nov 03 '22 22:11 danotorrey

This PR is now complete and ready for review. I will be adding tests shortly.

I've added a few additional reviewers for a few extra sets of eyes on this, since the changes here affect legacy configuration values.

danotorrey avatar Nov 09 '22 13:11 danotorrey

Found a small bug when testing.

  • Change Rotation Strategy to Index Time
  • Toggle Rotate Empty Index Sets and save the config
  • Rotation period should be shown as 0 seconds and the configuration is in an error state if you try to edit again.
  • Go back in and update to a valid Rotation Period and save the config
  • Edit the config again and Rotate Empty Index Sets is no longer true

Looks like the Rotate Empty Index Sets toggle has some faulty logic.

kingzacko1 avatar Nov 09 '22 15:11 kingzacko1

Found a small bug when testing.

  • Change Rotation Strategy to Index Time ...

Thanks @kingzacko1! Great find. That was a really interesting bug. I have fixed that in the latest commit and added the Rotate empty index set field to the summary view on the Configurations page too. Looks like both values are saving correctly now. Turns out a similar bug existed on the Edit Index Set page too (when editing a period, the Rotate Empty Indexes box would get unchecked). Double win! Thanks for seeing this.

image

danotorrey avatar Nov 09 '22 21:11 danotorrey

@kingzacko1 @ryan-carroll-graylog I think I have addressed the recent feedback, so I think this is ready for another look.

danotorrey avatar Nov 14 '22 20:11 danotorrey

@danotorrey Testing a server start with freshly blown away DBs, and the following fields overridden in my graylog.conf:

elasticsearch_analyzer = standard
rotation_strategy = size
elasticsearch_max_docs_per_index = 69
elasticsearch_max_size_per_index = 69
elasticsearch_max_number_of_indices = 69
retention_strategy = close
elasticsearch_shards = 6
elasticsearch_replicas = 6
index_field_type_periodical_full_refresh_interval = 69s

I got the following results where some indexes took the configuration file values and some didn't (Yellow looks like the defaults were used instead of the conf values; Green is neither...): image

I'm not sure how much of this is by design but I thought it merited calling out.

ryan-carroll-graylog avatar Nov 18 '22 20:11 ryan-carroll-graylog

some indexes took the configuration file values and some didn't

Thanks seeing that @ryan-carroll-graylog. That is definitely an issue. I found the cause and will fix it.

danotorrey avatar Nov 21 '22 21:11 danotorrey

Thanks for the additional feedback @kingzacko1! I'll clean those up.

@ryan-carroll-graylog I have been working on getting consistency among the system-created indices. I think I have almost made it through those updates. It was a little tricky, but I think I have it figured out. Should be pushing that up today or tomorrow.

danotorrey avatar Nov 30 '22 20:11 danotorrey

Thanks for the additional feedback @kingzacko1! I'll clean those up.

@ryan-carroll-graylog I have been working on getting consistency among the system-created indices. I think I have almost made it through those updates. It was a little tricky, but I think I have it figured out. Should be pushing that up today or tomorrow.

Ok awesome, I'll keep an eye out for the updates.

ryan-carroll-graylog avatar Nov 30 '22 20:11 ryan-carroll-graylog

@kingzacko1 @ryan-carroll-graylog Thanks for the additional reviews. I think I have addressed all points of feedback. I also just pushed up https://github.com/Graylog2/graylog2-server/pull/13018/commits/854695a6d552552840ab9273f0ffc13d9dd4d19e and https://github.com/Graylog2/graylog-plugin-enterprise/pull/3803/commits/f98c14cf3f1a2e06035d00ad49c23b1c75f78007 in the related PR that ensures all system indexes (listed below) use the appropriate defaults on first server startup. That was a little tricky, but I tried to centralize the configuration settings as much as possible. I definitely would appreciate a look over that commit.

  • Default index set
  • Graylog Events
  • Graylog System Events
  • Graylog Message Failures
  • Restored Archives
  • Illuminate indices

I still need to update the change log with the system index info, and I think there are some failing tests too, which I will address first thing in the morn.

danotorrey avatar Nov 30 '22 21:11 danotorrey

@ryan-carroll-graylog @kingzacko1 FYI, I just pushed up https://github.com/Graylog2/graylog2-server/pull/13018/commits/1fbf9fae2282b5c6deb64bcfe062d79cde222d91, which reverts the default index set creation migration back to it's original contents (matching master). That migration is unique, since back a long time ago, the default index settings were configurable in the server.conf file for a while, then, they were migrated to be configurable in the Cluster Config directly. Then, finally, they were migrated to the index_sets collection (VIA IndexConfig object). I had changed that migration to read from the server.conf directly, but it actually must still read from the Custer Config in the slim case that a user has modified the config there before upgrading to a subsequent version. Very slim case that could happen since the migrations are old, but I figured better to leave that one as it was and avoid possible issues.

The other default index migrations are not affected by this, since they all create new index sets that had not existed prior, so those can read from the server.conf (Elasticsearch configuration) as they are currently coded.

This should wrap up the changes needed for migrations. I'll update the changelog this afternoon, and unless you see any other issues, I think this is probably ready for merging and letting it get some time on master for testing be fore 5.1. Let me know what you think!

danotorrey avatar Dec 02 '22 17:12 danotorrey

Thanks for all of the time and efforts reviewing this @ryan-carroll-graylog and @kingzacko1! I've just pushed the final update to the change log. I will plan to merge this as soon as it turns green!

danotorrey avatar Dec 02 '22 21:12 danotorrey

Thanks @danotorrey and all reviewers!

This was a quite a significant change, but I think will be totally worth it.

boosty avatar Dec 05 '22 10:12 boosty

@boosty Thanks for the review and feedback. I've addressed both points. Can you please take one more look to make sure all looks ok? I have also added a test covering the default values, which is useful to see them at-a-glance: https://github.com/Graylog2/graylog2-server/blob/1865e1bbd285d06fbd2d202baf11bc906f55d075/graylog2-server/src/test/java/org/graylog2/configuration/ElasticsearchConfigurationTest.java#L37-L60

This was a quite a significant change, but I think will be totally worth it.

I definitely agree. I really like where this ended up. I really appreciate the feedback/guidance along the way.

danotorrey avatar Dec 05 '22 16:12 danotorrey

@ryan-carroll-graylog @kingzacko1 When you have some time, can you please review the latest changes? Mostly just updates to the change log and upgrade notes + defaults.

danotorrey avatar Dec 06 '22 16:12 danotorrey

Thanks the the feedback @boosty! I've cleaned those up.

danotorrey avatar Dec 06 '22 16:12 danotorrey

Unless any other feedback is received before, I will plan to merge this PR on Monday.

danotorrey avatar Dec 08 '22 20:12 danotorrey

I might have missed the conversation, but didn't we want to prefix the default settings with default_ and use the JadConfig fallback property to keep backward compatibility?

Thanks for mentioning this @bernd. The config library fallbackPropertyName capability only allows us the ability to specify a single fallback property (https://github.com/Graylog2/JadConfig/pull/113). So, we will need to coordinate and change to add default_ or index_init_default_ with the more general naming changes that are currently being discussed here https://github.com/Graylog2/graylog2-server/issues/13927. I suggest that we reach a renaming conclusion, then implement that in a single/standalone PR for clarity. This PR is already quite large. What do you think? If you agree, I will create an issue to track that. Looping in @todvora, since I believe he was going to work on the actual renaming change.

danotorrey avatar Dec 12 '22 13:12 danotorrey

I might have missed the conversation, but didn't we want to prefix the default settings with default_ and use the JadConfig fallback property to keep backward compatibility?

Thanks for mentioning this @bernd. The config library fallbackPropertyName capability only allows us the ability to specify a single fallback property (Graylog2/JadConfig#113). So, we will need to coordinate and change to add default_ or index_init_default_ with the more general naming changes that are currently being discussed here #13927. I suggest that we reach a renaming conclusion, then implement that in a single/standalone PR for clarity. This PR is already quite large. What do you think? If you agree, I will create an issue to track that. Looping in @todvora, since I believe he was going to work on the actual renaming change.

Thanks for the update, @danotorrey. I am good with doing the renaming in a separate PR. Thank you!

bernd avatar Dec 12 '22 14:12 bernd