
[Feature] Support joining an existing Galera cluster (e.g. across 2 different k8s clusters)

Open starizard opened this issue 1 year ago • 11 comments

Is your feature request related to a problem? Please describe. After deploying the operator in a Kubernetes cluster, we can create a new Galera cluster. But if we have many Kubernetes clusters that need to share the same Galera cluster, it's not possible.

Describe the solution you'd like The ability to create a MariaDB CR with a wsrep_cluster_address pointing at the DB nodes in another Kubernetes cluster (which are exposed via a LoadBalancer)
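For reference, the request boils down to the operator rendering a wsrep_cluster_address that mixes local and remote endpoints. A rough sketch of the resulting Galera configuration, where the 203.0.113.x addresses are hypothetical LoadBalancer IPs exposing the remote cluster's nodes:

```ini
# Sketch only: 10.0.0.x are local pod addresses, 203.0.113.x are
# placeholder LoadBalancer IPs for the nodes in the other cluster.
[galera]
wsrep_on=ON
wsrep_provider=/usr/lib/galera/libgalera_smm.so
wsrep_cluster_name=shared-galera
wsrep_cluster_address=gcomm://10.0.0.1,10.0.0.2,10.0.0.3,203.0.113.10,203.0.113.11,203.0.113.12
```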

starizard avatar Apr 27 '24 01:04 starizard

Hey there @starizard! Thanks for bringing this up

We have thought about this already:

  • https://github.com/mariadb-operator/mariadb-operator/issues/220

Although possible, there are a couple of things that need to be covered for this:

  • A per-node Service, allowing connections to each of the nodes individually from outside the cluster (LoadBalancer)
  • We will need to extend spec.galera to specify:
    • Extra peer FQDNs to connect to. As you said, these peers will be included in wsrep_cluster_address
    • How to authenticate connections with them
    • How to trust TLS connections with them
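The per-node Service part can already be sketched with plain Kubernetes resources, selecting a single StatefulSet replica via the pod-name label the StatefulSet controller sets on each Pod. A minimal example, assuming a StatefulSet named mariadb-galera (names are illustrative):

```yaml
# Hypothetical per-node Service: one LoadBalancer per Galera pod.
apiVersion: v1
kind: Service
metadata:
  name: mariadb-galera-0-external   # one such Service per pod (0, 1, 2, ...)
spec:
  type: LoadBalancer
  selector:
    statefulset.kubernetes.io/pod-name: mariadb-galera-0
  ports:
    - name: galera
      port: 4567   # Galera group communication
    - name: ist
      port: 4568   # incremental state transfer
    - name: sst
      port: 4444   # state snapshot transfer
```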

This new topology has implications we haven't faced before, so we will need to further investigate these points:

  • An operator is only able to manage Pods within the cluster it runs in, so how do we manage the Pods in an external cluster? Another operator running in the external cluster? If that were the case, we would need to take into account the following considerations:
    • There should be only one Galera cluster running across the 2 Kubernetes clusters. Creating 2 different Galera clusters would mean that we have a split-brain
    • The cluster recovery process gets tricky: we cannot control external Pods, and therefore we can't get the sequence number on them to know which is the most advanced node and therefore where to bootstrap the new cluster

To overcome the previous points, I can think of:

  • Not allowing writes on the external cluster. This way, we know for sure that the most advanced node will be in the current cluster and we can perform the cluster recovery. However, we still need to restart the Pods in the external cluster when that happens; this cross-cluster coordination will be tricky if not impossible.
  • Have 2 different Galera clusters in 2 different Kubernetes clusters and set up replication between them. Writes can only happen on one of the clusters, and the initial replication setup can be done by one of the operators, since it is done via cross-cluster SQL statements

Taking into account all my previous points, the latter option seems to be the most reasonable one TBH, and it seems to be a common pattern we can automate:

  • https://mariadb.com/kb/en/configuring-mariadb-replication-between-two-mariadb-galera-clusters/
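The cross-cluster SQL statements the operator would automate are essentially the ones from the KB article above. A hedged sketch, with host and credentials as placeholders (both clusters additionally need binary logging and consistent server_id/GTID settings, as described in the article):

```sql
-- On the primary cluster: create a replication user (placeholder credentials).
CREATE USER 'repl'@'%' IDENTIFIED BY 'repl_password';
GRANT REPLICATION SLAVE ON *.* TO 'repl'@'%';

-- On one node of the secondary Galera cluster: point replication at a
-- node (or load balancer) of the primary cluster and start it.
CHANGE MASTER TO
  MASTER_HOST='primary-cluster.example.com',
  MASTER_USER='repl',
  MASTER_PASSWORD='repl_password',
  MASTER_USE_GTID=current_pos;
START SLAVE;
```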

Sorry if this was too long, happy to hear your thoughts!

mmontes11 avatar May 03 '24 08:05 mmontes11

I would love for this to be possible, to include servers outside of Kubernetes as well

cyford avatar May 15 '24 16:05 cyford

Just dropping in here. @mmontes11 what if the operator had access and permissions to the "remote" cluster so it could control pods?

Then you'd have one cluster as the primary, and the other(s) being controlled by the primary?

Of course this requires networking to work; while someone could implement routing, another option is WireGuard to create the cross-cluster links for wsrep. Perhaps that implementation detail is left to the user?

Are there any other gotchas that we'd have to look out for? Our use case would be a cluster between two countries, with approx. 50 ms between the two

chriswiggins avatar Feb 05 '25 21:02 chriswiggins

Also, just dropping in here @chriswiggins, I think supporting such a use case would require a lot of changes. For example, instead of using a StatefulSet, it would rely on plain Pod resources. Kubernetes cluster access handling and related considerations would also come into play.

That said, I think the best option would be to have a single cluster spread across multiple Kubernetes clusters. Networking should definitely be something the user configures, but we would need to establish some prerequisites.

The approach of replicating between two clusters is easier to implement but wouldn't benefit from features like MaxScale, automatic recovery, and similar mechanisms. I would also love to see a Galera cluster span multiple Kubernetes clusters, but my impression is that achieving this would require substantial changes - almost like developing a new solution from scratch.

I've experimented with tools like Admiralty and Karmada. Karmada, as far as I know, isn't an option because it doesn't support installing operators. Admiralty might work, but honestly, I'm tired of testing different solutions. In short, Admiralty treats Kubernetes clusters as nodes, so setting anti-affinity on the MariaDB StatefulSet might be enough to stretch it across clusters. However, I'm not entirely sure how status reflection on StatefulSets works or whether MariaDB can handle operations on proxy pods correctly.

If it can - either as-is or with only minor modifications - this could be a relatively low-cost engineering solution for enabling multi-cluster support in the MariaDB operator.
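For reference, the anti-affinity mentioned above would be a standard podAntiAffinity on the StatefulSet's pod template; since Admiralty presents each Kubernetes cluster as a virtual node, keying on kubernetes.io/hostname should push replicas onto different clusters. An untested sketch, with the pod label purely illustrative:

```yaml
# Fragment of a pod template spec (assumed labels; untested with Admiralty).
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app.kubernetes.io/name: mariadb     # illustrative pod label
        topologyKey: kubernetes.io/hostname     # each Admiralty "node" is a cluster
```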

@chriswiggins , is this something you'd be interested in testing?

3deep5me avatar Feb 15 '25 14:02 3deep5me

Some more input: spreading Galera clusters between different regions is not so easy because of latency and so on. A good middle way could be replication with manual failover, similar to this: https://cloudnative-pg.io/documentation/current/replica_cluster/#distributed-topology

3deep5me avatar Feb 15 '25 15:02 3deep5me

@mmontes11 is the option "2 different Galera clusters in 2 different Kubernetes clusters and setup replication between them" achievable with the current operator, even if it requires some manual steps and isn’t fully automated?

laurentiusoica avatar Oct 06 '25 14:10 laurentiusoica

@laurentiusoica

I performed a quick test as described below. Please note that this is not production-grade; I only implemented a basic proof of concept.

  1. I deployed a Galera cluster on each Kubernetes cluster.

  2. Then, I accessed one of the MariaDB Galera cluster pods that would act as the slave (e.g., mariadb-cluster-0) and manually configured the replication based on the following document: https://mariadb.com/kb/en/configuring-mariadb-replication-between-two-mariadb-galera-clusters/. Of course, the necessary network, gateway, and VIP configurations for the replication had to be set up first.

  3. At this point, I was able to confirm that replication between the clusters was working.
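For anyone reproducing this, step 3 can be verified on the replicating node with the standard replication status command:

```sql
-- On the slave-side node, check that both replication threads are running
-- and that there is no replication lag or error:
SHOW SLAVE STATUS\G
-- Look for:
--   Slave_IO_Running: Yes
--   Slave_SQL_Running: Yes
--   Seconds_Behind_Master: 0 (or small)
```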

Finally, I truly hope that native replication functionality between different clusters becomes available as well.

aoc55 avatar Nov 12 '25 06:11 aoc55