K8SSAND-1789 ⁃ .spec.cassandra.additionalSeeds should allow hostnames
What is missing?
The API docs for additionalSeeds say that hostnames can be used, but doing so results in an error like the following:
ERROR controller.k8ssandracluster Reconciler error {"reconciler group": "k8ssandra.io", "reconciler kind": "K8ssandraCluster", "name": "demo", "namespace": "k8ssandra-operator", "error": "Endpoints \"demo-dc1.1-additional-seed-service\" is invalid: [subsets[0].addresses[0].ip: Invalid value: \"demo-seed-service.k8ssandra.svc.cluster.local\": must be a valid IP address, (e.g. 10.9.8.7 or 2001:db8::ffff), subsets[0].addresses[0].ip: Invalid value: \"demo-seed-service.k8ssandra.svc.cluster.local\": must be a valid IP address]"}
The primary focus when adding support for additionalSeeds was migrating existing Cassandra clusters running outside of Kubernetes. In these situations DNS is usually not available.
Cassandra nodes in a K8ssandraCluster are configured with two seeds. Let's say our Cassandra cluster is named test and has 3 DCs: dc1, dc2, and dc3. The pods for each DC will have two seeds: test-seed-service and test-<dc-name>-additional-seed-service. Both are k8s services. Cassandra is configured to use a custom seeds resolver that does a hostname lookup on the services to get the pod IPs.
Kubernetes manages the IPs for test-seed-service. The operator, however, manages the IPs for the additional seeds service: it creates the Endpoints object that contains the IP addresses specified in .spec.cassandra.additionalSeeds. As the error above says, those entries must be IP addresses.
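To illustrate the constraint, here is a minimal Python sketch (not operator code; `split_seeds` is a hypothetical helper name). It assumes the only distinction that matters is whether an entry parses as an IPv4 or IPv6 literal, which is what the Endpoints validation in the error above checks. Anything else is treated as a hostname that could not be written into the Endpoints object as-is:

```python
import ipaddress


def split_seeds(seeds):
    """Partition seed entries into IP literals and everything else.

    Hypothetical helper: entries that are not IP literals are assumed
    to be hostnames that need separate handling.
    """
    ips, hostnames = [], []
    for seed in seeds:
        try:
            ipaddress.ip_address(seed)  # accepts IPv4 and IPv6 literals
            ips.append(seed)
        except ValueError:
            hostnames.append(seed)
    return ips, hostnames


ips, hostnames = split_seeds([
    "10.9.8.7",
    "2001:db8::ffff",
    "demo-seed-service.k8ssandra.svc.cluster.local",
])
print(ips)        # entries an Endpoints object would accept
print(hostnames)  # entries that trigger the validation error
```

Only the first list could be placed in `subsets[].addresses[].ip`; the second would need to be resolved or referenced some other way.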
Why do we need it?
Migrating from k8ssandra 1.x is usually done within the same k8s cluster. Migrations would be a lot easier if we could specify the seed service of the k8ssandra 1.x DC; otherwise the process is going to be very error prone since pod IPs can change.
To make this clearer, I would consider something more explicit, maybe along these lines:
spec:
  cassandra:
    additionalSeeds:
      - 1.2.3.4
    additionalDatacenters:
      - namespace: k8ssandra
        name: dc1
additionalDatacenters is a list of ObjectReferences pointing to CassandraDatacenter objects. The operator can take care of resolving the seed service name. The seeds service test-seed-service.k8ssandra.svc.cluster.local would then be added as a seed for each Cassandra node.
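As a rough sketch of the resolution step, the Python below derives the seed service name from a reference. Everything here is an assumption for illustration: the naming pattern `<cluster>-seed-service.<namespace>.svc.<cluster-domain>` is inferred from the names used in this issue, `DATACENTERS` is a hypothetical stand-in for looking up the CassandraDatacenter object, and `cluster.local` is assumed as the cluster domain.

```python
# Hypothetical stand-in for fetching CassandraDatacenter objects from the API server.
# Keys are (namespace, name); the cluster name is assumed to be stored on the object.
DATACENTERS = {
    ("k8ssandra", "dc1"): {"cluster_name": "test"},
}


def seed_service_fqdn(cluster_name, namespace, cluster_domain="cluster.local"):
    """Build the seed service DNS name, assuming the pattern used in this issue."""
    return f"{cluster_name}-seed-service.{namespace}.svc.{cluster_domain}"


def resolve_additional_datacenter(ref):
    """Map an additionalDatacenters entry to the seed service hostname."""
    dc = DATACENTERS[(ref["namespace"], ref["name"])]
    return seed_service_fqdn(dc["cluster_name"], ref["namespace"])


seed = resolve_additional_datacenter({"namespace": "k8ssandra", "name": "dc1"})
print(seed)
```

The reference carries only the datacenter's namespace and name, so the cluster name has to come from the referenced object itself, which is why the lookup step is needed.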
Environment
- K8ssandra Operator version: 1.2.1
This looks great, though the provided example still shows IP addresses as additionalSeeds. For non-K8ssandra DCs, I am unclear why there is the presumption that DNS resolution to the originating DC(s) would be unavailable, but if this story allows for DNS resolution then the point is moot.
One clarifying question: if additionalDatacenters is provided and resolves to a CassandraDatacenter, would additionalSeeds need to be specified at all? Seems it would become redundant...