[Feature] Support joining an existing Galera cluster (e.g., across 2 different k8s clusters)
Is your feature request related to a problem? Please describe. After deploying the operator in a k8s cluster, we can create a new Galera cluster. But if we have many k8s clusters that need to share the same Galera cluster, it's not possible.
Describe the solution you'd like
The ability to create a MariaDB CR with a `wsrep_cluster_address` pointing to the DB nodes in another Kubernetes cluster (which are exposed via LoadBalancer).
Hey there @starizard! Thanks for bringing this up
We have thought about this already:
- https://github.com/mariadb-operator/mariadb-operator/issues/220
Although possible, there are a couple of things that need to be covered for this:
- A per-node `Service`, allowing connections to each of the nodes individually from outside the cluster (`LoadBalancer`)
- We will need to extend `spec.galera` to specify:
  - Extra peer FQDNs to connect to. As you said, these peers will be included in `wsrep_cluster_address`
  - How to authenticate connections with them
  - How to trust TLS connections with them
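A hypothetical shape for such an API extension could look like this (sketch only: the `externalPeers` and `externalPeersTLSSecretRef` fields do not exist in the current CRD, and the FQDNs and secret name are placeholders):

```yaml
apiVersion: k8s.mariadb.com/v1alpha1
kind: MariaDB
metadata:
  name: mariadb-galera
spec:
  replicas: 3
  galera:
    enabled: true
    # Hypothetical fields, not part of the current CRD:
    externalPeers:
      # FQDNs of per-node LoadBalancer Services in the other cluster,
      # to be appended to wsrep_cluster_address.
      - galera-0.db.example.com
      - galera-1.db.example.com
    # How to authenticate and trust TLS connections with those peers.
    externalPeersTLSSecretRef:
      name: external-peers-ca
```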
This new topology has implications we haven't faced before, so we will need to further investigate these points:
- An operator is only able to manage `Pods` within the cluster it runs in. How do we manage the `Pods` in an external cluster? Another operator running in the external cluster? If that were the case, we would need to take the following considerations into account:
  - There should be only one Galera cluster running across the 2 Kubernetes clusters. Creating 2 different Galera clusters would mean a split brain
  - The cluster recovery process gets tricky: we cannot control external `Pods`, and therefore we can't get their sequence numbers to know which is the most advanced node and thus where to bootstrap the new cluster
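For context on the sequence-number point: each Galera node persists its last committed seqno in `grastate.dat`, and recovery must bootstrap from the node with the highest value. A minimal sketch of reading it (the sample file contents below are illustrative):

```shell
# Galera persists its last committed sequence number in grastate.dat;
# the node with the highest seqno is where the cluster must be bootstrapped.
# Sample file contents (illustrative values):
cat > /tmp/grastate.dat <<'EOF'
# GALERA saved state
version: 2.1
uuid:    6a1f3b2c-0000-0000-0000-000000000000
seqno:   42
safe_to_bootstrap: 0
EOF

# Extract the seqno; a value of -1 means the node shut down uncleanly
# and needs `mariadbd --wsrep-recover` to recover it from InnoDB.
seqno=$(awk '/^seqno:/ {print $2}' /tmp/grastate.dat)
echo "$seqno"
```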
To overcome the previous points, I can think of:
- Not allowing writes on the external cluster. This way, we know for sure that the most advanced node will be in the current cluster and we can perform the cluster recovery. However, we would still need to restart the `Pods` in the external cluster when that happens; this cross-cluster coordination will be tricky, if not impossible.
- Having 2 different Galera clusters in 2 different Kubernetes clusters and setting up replication between them. Writes can only happen on one of the clusters, and the initial replication setup can be done by one of the operators, since it is done via cross-cluster SQL statements
Taking into account all my previous points, the latter option seems to be the most reasonable one TBH, and it seems to be a common pattern we can automate:
- https://mariadb.com/kb/en/configuring-mariadb-replication-between-two-mariadb-galera-clusters/
Sorry if this was too long, happy to hear your thoughts!
I would love for this to also make it possible to include servers outside of Kubernetes.
Just dropping in here. @mmontes11, what if the operator had access and permissions to the "remote" cluster so it could control Pods?
Then you'd have one cluster as the primary, and the other(s) controlled by the primary?
Of course this requires networking to work; while someone could implement routing, the other option is WireGuard to create the cross-cluster links for wsrep. Perhaps that implementation detail is left to the user?
Are there any other gotchas that we'd have to look out for? Our use case would be a cluster spanning two countries, with approximately 50 ms of latency between the two.
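For illustration, a cross-cluster link of that kind could be a plain point-to-point WireGuard tunnel between the two clusters' gateways (all addresses, keys, hostnames, and CIDRs below are placeholders):

```conf
# /etc/wireguard/wg0.conf on cluster A's gateway (placeholder values)
[Interface]
Address = 10.88.0.1/24
ListenPort = 51820
PrivateKey = <cluster-a-private-key>

[Peer]
# Cluster B's gateway; route its Pod/Service CIDR through the tunnel
PublicKey = <cluster-b-public-key>
Endpoint = cluster-b.example.com:51820
AllowedIPs = 10.88.0.2/32, 10.245.0.0/16
PersistentKeepalive = 25
```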
Also, just dropping in here @chriswiggins, I think supporting such a use case would require a lot of changes. For example, instead of using a StatefulSet, it would rely on plain Pod resources. Kubernetes cluster access handling and related considerations would also come into play.
That said, I think the best option would be to have a single cluster spread across multiple Kubernetes clusters. Networking should definitely be something the user configures, but we would need to establish some prerequisites.
The approach of replicating between two clusters is easier to implement but wouldn't benefit from features like MaxScale, automatic recovery, and similar mechanisms. I would also love to see a Galera cluster span multiple Kubernetes clusters, but my impression is that achieving this would require substantial changes - almost like developing a new solution from scratch.
I've experimented with tools like Admiralty and Karmada. Karmada, as far as I know, isn't an option because it doesn't support installing operators. Admiralty might work, but honestly, I'm tired of testing different solutions. In short, Admiralty treats Kubernetes clusters as nodes, so setting anti-affinity on the MariaDB StatefulSet might be enough to stretch it across clusters. However, I'm not entirely sure how status reflection on StatefulSets works or whether MariaDB can handle operations on proxy pods correctly.
If it can - either as-is or with only minor modifications - this could be a relatively low-cost engineering solution for enabling multi-cluster support in the MariaDB operator.
@chriswiggins , is this something you'd be interested in testing?
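If Admiralty really does surface member clusters as virtual nodes, stretching the existing StatefulSet might only need a spread constraint on its Pod template (sketch under that assumption; the topology key and labels depend on the actual Admiralty setup):

```yaml
# Pod template fragment: spread Galera Pods across (virtual) nodes,
# i.e. across member clusters if each cluster shows up as one node.
topologySpreadConstraints:
  - maxSkew: 1
    topologyKey: kubernetes.io/hostname  # or an Admiralty cluster label
    whenUnsatisfiable: DoNotSchedule
    labelSelector:
      matchLabels:
        app.kubernetes.io/name: mariadb
```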
Some more input: spreading Galera clusters across different regions is not so easy because of latency and so on. A good middle ground would be replication similar to this, with manual failover: https://cloudnative-pg.io/documentation/current/replica_cluster/#distributed-topology
@mmontes11 is the option "2 different Galera clusters in 2 different Kubernetes clusters and setup replication between them" achievable with the current operator, even if it requires some manual steps and isn’t fully automated?
@laurentiusoica
I performed a quick test as described below. Please note that this is not running at a production level; I only implemented a basic proof of concept.
- I deployed a Galera cluster on each Kubernetes cluster.
- Then, I accessed one of the MariaDB Galera cluster pods that would act as the slave (e.g., mariadb-cluster-0) and manually configured the replication based on the following document: https://mariadb.com/kb/en/configuring-mariadb-replication-between-two-mariadb-galera-clusters/. Of course, the necessary network, gateway, and VIP configurations for the replication had to be set up first.
- At this point, I was able to confirm that replication between the clusters is working.
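For reference, the manual replication step above roughly follows the linked KB pattern: on one node of the replica cluster, point replication at the other cluster using GTID. A sketch with placeholder hostnames and credentials:

```sql
-- Run on one node of the replica Galera cluster.
-- Prerequisites on both clusters (server config, not shown here):
--   log_slave_updates=ON, wsrep_gtid_mode=ON, and a distinct server_id /
--   wsrep_gtid_domain_id per cluster.
CHANGE MASTER TO
  MASTER_HOST = 'primary-galera.example.com',  -- LB/VIP of the primary cluster
  MASTER_PORT = 3306,
  MASTER_USER = 'repl',
  MASTER_PASSWORD = 'replica-password',
  MASTER_USE_GTID = slave_pos;

START SLAVE;

-- Verify: Slave_IO_Running and Slave_SQL_Running should both be 'Yes'.
SHOW SLAVE STATUS\G
```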
Finally, I truly hope that native replication functionality between different clusters becomes available as well.