ansible-role-for-splunk
KVStore Tools
Summary
This PR adds KVStore tools that the user can configure, including:
- Disable
- Backup
- Oplog and storage engine set at install time
- KVStore Storage Engine & Server Version Upgrade
Possible other features to include:
- KVStore Restore from backup
- KVStore Clean
- KVStore Resync
- Oplog change in SHC
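For orientation, the install-time items in the summary correspond to settings in the `[kvstore]` stanza of `server.conf`. The following is only a minimal sketch of how such settings could be written with `community.general.ini_file`; it is not taken from the PR, and the `splunk_home` variable, the file location, and the example values are assumptions.

```yaml
# Sketch only: splunk_home, the target file, and the values shown are assumptions.
- name: Set KVStore oplog size and storage engine in server.conf
  community.general.ini_file:
    path: "{{ splunk_home }}/etc/system/local/server.conf"
    section: kvstore
    option: "{{ item.option }}"
    value: "{{ item.value }}"
    mode: "0600"
  loop:
    - { option: oplogSize, value: "2000" }          # size in MB, placeholder value
    - { option: storageEngine, value: "wiredTiger" }
```

Setting `disabled = true` in the same stanza would cover the "Disable" item; a Splunk restart is needed for changes in this file to take effect.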
@arcsector This is a great PR, especially the migration part. As I mentioned in the other comments, the commands need authentication, so we either need to add it to each command, or start the whole task with `splunk login` so we don't need to set `no_log` on all the commands. I have not tested the `splunk login` method, but I assume it would work.
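The two authentication options described above could look roughly like the sketch below. It is illustrative only: the variable names (`splunk_home`, `splunk_admin_username`, `splunk_admin_password`) are assumptions, and, as the comment says, the `splunk login` approach is untested here.

```yaml
# Option 1: pass credentials on every command and hide them with no_log.
- name: Show KVStore status with inline credentials
  ansible.builtin.command: "{{ splunk_home }}/bin/splunk show kvstore-status -auth {{ splunk_admin_username }}:{{ splunk_admin_password }}"
  register: kvstore_status_out
  changed_when: false
  no_log: true   # keeps the password out of task output

# Option 2 (untested assumption): log in once, then run later commands without -auth.
- name: Log in to the Splunk CLI once
  ansible.builtin.command: "{{ splunk_home }}/bin/splunk login -auth {{ splunk_admin_username }}:{{ splunk_admin_password }}"
  changed_when: false
  no_log: true   # only this task needs to hide its arguments

- name: Show KVStore status using the cached CLI session
  ansible.builtin.command: "{{ splunk_home }}/bin/splunk show kvstore-status"
  register: kvstore_status_out
  changed_when: false
```

With the second option the follow-up tasks stay readable in logs, but they must run as the same OS user that performed the login.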
Fixed all the auth issues (sorry it slipped my mind) in https://github.com/splunk/ansible-role-for-splunk/pull/177/commits/ce2c80a4bbd06fedb016c01c403d2b927a4abdf6
Guess I accidentally created a merge commit - my bad. Feel free to remove - I'm not brave enough to force push to a fork.
After some preliminary testing, there are some issues that need to be addressed here.
- I think the `adhoc_destructive_resync_kvstore.yml` task should be removed from the PR. Some things there do not completely make sense to me. Maybe I'm understanding it wrong, but I don't think you need to remove a member from the SHC to do that.
- The `kvstore_upgrade.yml` task does not work on version 8.2.x because `splunk_kvstore_version` does not return anything, so it fails on the conditional for the block.
- Leaving `splunk_kvstore_storage` undefined also will not update the engine, because of the conditionals on the block level. At least on the 8.2 that I tested.
@jewnix My thoughts:
- Destructive resync is something that has been recommended to me through Splunk Support multiple times in order to fix KV issues, so I included it here - it seems that the advantage to it over just a normal clean is that it pulls both a fresh copy of the SH Bundle as well as the KV bundle, so that's the benefit there.
- Good call, I'll put a default value in there for versions <= 8.2
- That's what I want: people should have to specify that they're upgrading, so that they can do so one cluster component at a time.
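To make the last two points concrete, here is a hedged sketch of a block-level condition that requires an explicit opt-in and falls back to a default when the version lookup returns nothing. It is not the PR's code: the fallback `0.0.0`, the threshold `4.2`, and the registered variable `kvstore_version_out` are hypothetical placeholders.

```yaml
# Sketch only: kvstore_version_out is assumed to be registered by an earlier task.
- name: Run KVStore upgrade tasks only when explicitly requested
  when:
    - splunk_kvstore_storage is defined   # explicit opt-in; undefined means "do nothing"
    # default(..., true) also replaces an empty string, so the version test never sees ''
    - "(kvstore_version_out.stdout | default('0.0.0', true)) is version('4.2', '<')"
  block:
    - name: Placeholder for the real upgrade tasks
      ansible.builtin.debug:
        msg: "KVStore upgrade would run here"
```

The second argument to `default` matters: without it, an empty `stdout` is passed through unchanged and the `version` test fails on it.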
@arcsector
> - Destructive resync is something that has been recommended to me through Splunk Support multiple times in order to fix KV issues, so I included it here - it seems that the advantage to it over just a normal clean is that it pulls both a fresh copy of the SH Bundle as well as the KV bundle, so that's the benefit there.
So here is what I think: the destructive resync should be removed from this PR. 1. This is a snowflake issue, and destructive KVStore sync is not something that is documented. 2. It also destroys the SHC completely. Just because support sometimes tells you to do it does not mean it's something that should be done.
@dtwersky @jewnix Sorry I've been inactive on this. I removed the destructive resync, and I added a default value, though it's not for `splunk_kvstore_version`, rather for the `splunk_current_server_version_out.stdout` check, where the former is powered by default vars, and the latter is performed by a CLI call.
Updating this with an oplog size increase, as well as some helpful tasks to get KVStore-status and SHCluster-status as JSON blobs for Ansible consumption. I will note this isn't using the docs' oplog increase method, but rather a method that support had been passing around for a long time. If you'd prefer that it follow the documented method, I can do that instead. Let me know!
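For readers who want status as JSON without parsing CLI output, one alternative is Splunk's REST API. The sketch below is an assumption about how that could be done and is not the method used in this PR; the management port `8089`, the credential variables, and the disabled certificate validation are placeholders.

```yaml
# Sketch only: not the PR's tasks; host, port, and variable names are assumptions.
- name: Fetch KVStore status as JSON over the management port
  ansible.builtin.uri:
    url: "https://localhost:8089/services/kvstore/status?output_mode=json"
    user: "{{ splunk_admin_username }}"
    password: "{{ splunk_admin_password }}"
    force_basic_auth: true
    validate_certs: false   # placeholder for self-signed certificates
    return_content: true
  register: kvstore_status_json
  no_log: true

- name: Show the parsed response
  ansible.builtin.debug:
    var: kvstore_status_json.json
```

The `uri` module exposes the parsed body under `.json` when the response is JSON, so later tasks can read fields from it directly.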
Hi @arcsector,
Sorry this was left dormant for so long after so much work has gone into it. I have been working internally on a different project, one focused on ephemeral Docker instances, and along the way I discovered a lot of things related to this PR that made me look at KVStore upgrades a little differently. Upgrade paths differ considerably across Splunk versions, MongoDB versions, and MongoDB engines. I'm not sure we should assume that people are still running version 8, and because later versions already migrate and update automatically, some of these commands may only be needed in specific scenarios.
There are so many amazing things in this PR, and I don't want to close it out and start fresh, but maybe it needs to be revisited so we can decide whether we want to make it compatible with older versions or with major version jumps.
What are your thoughts?
Thanks so much for the positive comments, glad you like the materials here. I'm definitely open to revisiting this as a PR of optional tasks and then making a playbook that calls all of them to do an all-in-one upgrade. Does that sound like a good plan? I could even put them in a sub-folder roles/splunk/tasks/kvstore/... if that would help to centralize this.
Do you happen to have a good map of those version transitions and what they entail as far as mongod version and engine? I'm having to go through the docs and switch back and forth between versions, as it's not clear what the approach should even be... Thanks!