resource-agents

Enhance galera to interact over multiple clusters

Open zzzeek opened this issue 7 years ago • 7 comments

This change adds a new resource agent "stretch_galera" which builds off of the existing "galera" agent. To accommodate this, the "galera" agent's shell script structure is modified slightly so that it can be sourced for its functions.

The new resource agent adds a new parameter "remote_node_map" to the Galera resource agent which allows it to consider galera node names that are in other clusters as part of its Galera quorum. To achieve this, it launches read-only pcs commands to the remote clusters in order to view and modify remote state variables.

Additionally, the stretch agent honors an optional pcs attribute -initial-bootstrap which, when applied to the local pcs nodes, allows Galera to be bootstrapped with only that subset of nodes, without waiting for the additional remote nodes to become available. An installer can set these attributes to allow the first pcs cluster to come online before subsequent clusters, and then remove the attributes.

zzzeek avatar Apr 19 '18 21:04 zzzeek

Can one of the admins verify this patch?

knet-ci-bot avatar Apr 19 '18 21:04 knet-ci-bot

add to whitelist

fabbione avatar Apr 20 '18 03:04 fabbione

cc @beekhof

zzzeek avatar Apr 20 '18 13:04 zzzeek

TBH, this scares the heck out of me.

At the minimum, can we put this in a new agent (source the original then add the new functionality)?

I can see the need for special handling (what to do if the remote isn't reachable for example) that I'd not want to complicate the non-stretch version with.

beekhof avatar Jun 14 '18 01:06 beekhof

@beekhof as far as sourcing the original goes, I will need to add a branch to the case statement at the end so that a command like "sourceonly" does a simple "return" and avoids the "exit" call. I am not finding any bash magic to source all the variables and have them survive past an "exit".
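A minimal sketch of that idea, not the actual galera agent: the demo agent written to a temp file, the `demo_status` function, and the `status` action are all hypothetical stand-ins. It only shows that a top-level `return` in a sourced file hands control back to the caller, whereas `exit` would terminate the caller's shell.

```shell
# Hypothetical stand-in for the original agent, written to a temp file.
agent=$(mktemp)
cat > "$agent" <<'EOF'
demo_status() { echo "status from sourced function"; }

case "$1" in
    sourceonly) return 0 ;;             # sourced: keep definitions, skip exit
    status)     demo_status; exit $? ;; # executed: normal action dispatch
    *)          echo "usage: {status|sourceonly}" >&2; exit 3 ;;
esac
EOF

# Hypothetical wrapper side: set the positional parameter the sourced file
# inspects, then source it. Note that this replaces the caller's own "$@".
set -- sourceonly
. "$agent"

# The function defined in the sourced file is now available here.
result=$(demo_status)
echo "$result"
rm -f "$agent"
```

A wrapper that still needs its own arguments would have to save them before the `set --` call.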

zzzeek avatar Jul 26 '18 19:07 zzzeek

@beekhof note this change no longer writes any data over SSH and only reports status, so we could also replace the usage of SSH with an xinetd service linked to a script. however, in the xinetd case, authentication and encryption go out the window, as far as I understand how xinetd services can be created. the security audit will then say that if the xinetd script itself has a vulnerability, it's openly exposed.

another option is to use SSH but limit the commands / scripts that the user can execute. this can be done either with a custom shell in /etc/passwd, or apparently you can limit commands for a specific key in authorized_keys; been googling that a bit. I think having a "front end" that is reached via ssh but is nonetheless just a single script with an argument is the least we can do.
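A rough sketch of what such a front end could look like, assuming the per-key restriction is done with the `command="..."` option in authorized_keys. The script path, the `frontend_dispatch` function, the `status` argument, and the reply text are hypothetical; only the `command=` and `no-*` key options and the `SSH_ORIGINAL_COMMAND` variable are standard OpenSSH behavior.

```shell
# Hypothetical authorized_keys entry on the remote node (key elided):
#   command="/usr/local/bin/galera-frontend",no-pty,no-port-forwarding,no-agent-forwarding,no-X11-forwarding ssh-rsa ...
#
# With a forced command, sshd runs that script no matter what the client
# asked for, and exposes the client's request in SSH_ORIGINAL_COMMAND.

frontend_dispatch() {
    # Accept exactly one known argument; reject anything else.
    case "$1" in
        status)
            # Placeholder: a real front end would report cluster state here.
            echo "status: placeholder report"
            ;;
        *)
            echo "rejected: unsupported request" >&2
            return 1
            ;;
    esac
}

# Simulate what sshd would provide for "ssh user@remote status".
SSH_ORIGINAL_COMMAND="status"
reply=$(frontend_dispatch "$SSH_ORIGINAL_COMMAND")
echo "$reply"
```

Because the dispatch matches the whole request string against a fixed list, anything other than the single allowed argument is refused rather than passed to a shell.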

zzzeek avatar Aug 08 '18 14:08 zzzeek

@zzzeek can you fix the quote-issues reported by Travis CI?

oalbrigt avatar Aug 13 '18 11:08 oalbrigt