Setting `admin_nopass` to false prevents the operator from configuring instances
Configuring authentication on the non-admin port works perfectly with:

```yaml
authentication:
  passwordFromSecret:
    name: foo
    key: password
```
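For reference, the Secret this points at is an ordinary opaque Secret; a minimal sketch (the namespace and the password value are placeholders, assuming the secret lives alongside the Dragonfly instance):

```yaml
# Minimal sketch of the Secret referenced above.
# Namespace and password value are placeholders.
apiVersion: v1
kind: Secret
metadata:
  name: foo
  namespace: default
type: Opaque
stringData:
  password: change-me  # stored base64-encoded by the API server
```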
However, inspecting `kubectl describe statefulset dragonfly` shows that the containers are configured with the following arguments:

```
Args:
  --alsologtostderr
  --primary_port_http_enabled=false
  --admin_port=9999
  --admin_nopass
```
I confirmed that when authentication is configured, connecting to the non-admin port requires a password, and connecting to the admin port does not. I assumed that overriding `admin_nopass` would cause the operator to authenticate with the secret above, but it still attempts an unauthenticated connection to the admin port.
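For clarity, the override I'm referring to is along these lines; this is only a sketch assuming the CRD's `args` passthrough, and the exact spelling isn't the point:

```yaml
# Hypothetical override: force-disable admin_nopass via extra args.
apiVersion: dragonflydb.io/v1alpha1
kind: Dragonfly
metadata:
  name: dragonfly
spec:
  args:
    - "--admin_nopass=false"
  authentication:
    passwordFromSecret:
      name: foo
      key: password
```

With that in place, the operator logs the following: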
```
2023-11-17T02:39:39Z ERROR Failed to mark pod as master {"controller": "pod", "controllerGroup": "", "controllerKind": "Pod", "Pod": {"name":"dragonfly-0","namespace":"default"}, "namespace": "default", "name": "dragonfly-0", "reconcileID": "<id>", "podName": "dragonfly-0", "error": "error running SLAVE OF NO ONE command: NOAUTH Authentication required."}
github.com/dragonflydb/dragonfly-operator/internal/controller.(*DragonflyInstance).configureReplication
	/workspace/internal/controller/dragonfly_instance.go:108
github.com/dragonflydb/dragonfly-operator/internal/controller.(*DfPodLifeCycleReconciler).Reconcile
	/workspace/internal/controller/dragonfly_pod_lifecycle_controller.go:92
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:122
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:323
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:274
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
	/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:235
2023-11-17T02:39:39Z INFO could not initialize replication. will retry {"controller": "pod", "controllerGroup": "", "controllerKind": "Pod", "Pod": {"name":"dragonfly-0","namespace":"default"}, "namespace": "default", "name": "dragonfly-0", "reconcileID": "ac977a2c-ea01-49f8-8812-574a2d9acc31", "error": "error running SLAVE OF NO ONE command: NOAUTH Authentication required."}
```
I assume this is just because the username/password pair is not passed when the redis client is constructed at internal/controller/dragonfly_instance.go:369. Are there plans to support this, or would a pull request adding it be welcome?
Hi @diffuse, thanks for reaching out!
IIUC, the operator uses the admin port with `--admin_nopass` so that no auth is required for the commands invoked by the operator.
Can you please provide the log line from https://github.com/dragonflydb/dragonfly-operator/blob/8a3c45967da84f78e93761c12ada41d1f3ecff0d/internal/controller/dragonfly_instance.go#L346?
Hello @ashotland, thanks for responding! That line is never logged because configuring the master in `replicaOfNoOne` fails before it can get to the `replicaOf` calls below.
I could be missing something, but leaving the admin connection unprotected for operator access means that any compromised containers in the cluster could connect with full db access.
I can take a crack at connecting with the credentials provided in the resources if this is something that would be a welcome addition :)
Hi @diffuse, you are correct that `--admin_nopass` assumes that intra-cluster access is trusted.
If this is not the case, https://github.com/dragonflydb/dragonfly-operator/pull/136 makes sense.
Thanks for your contribution; we'll try to get it reviewed soon.
+1
Replying to @Pothulapati's comment on the PR here:
> I'm sure some users would not like giving the Operator access to all the secrets in the cluster
It can be limited to secrets in the operator's namespace specifically (which is a very common pattern).
It doesn't even have to be configurable per-cluster in the first round, I think.
> We can add a toggle, but we aren't yet sure how many users actually want this, as most users trust applications in their cluster.
It is very much the same as not setting any authentication on a database, in particular for the superuser of said database.
I don't believe anyone seriously does that anymore.
In particular, imagine someone cluelessly setting `.serviceSpec.type: LoadBalancer`, thinking they're fine because of the password auth... And many users who care about issues like this probably already have network RBAC around which services can access what.
Requiring a full service mesh (or similar deep k8s networking RBAC) comes at an exceptionally high cost in both resources and complexity.
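For illustration, the kind of network RBAC I mean above can be a plain NetworkPolicy along these lines (a sketch only; the labels and namespace are assumptions, not values the operator actually sets):

```yaml
# Sketch: keep the password-protected client port open cluster-wide,
# but let only the operator's pods reach the unauthenticated admin port.
# All label selectors and the namespace are illustrative assumptions.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: dragonfly-admin-port
  namespace: default
spec:
  podSelector:
    matchLabels:
      app: dragonfly              # hypothetical label on the Dragonfly pods
  policyTypes: ["Ingress"]
  ingress:
    # Regular client port: reachable as before (guarded by password auth).
    - ports:
        - protocol: TCP
          port: 6379
    # Admin port: reachable only from operator pods, whatever their namespace.
    - from:
        - namespaceSelector: {}
          podSelector:
            matchLabels:
              app.kubernetes.io/name: dragonfly-operator  # hypothetical label
      ports:
        - protocol: TCP
          port: 9999
```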
> Until we have more requests from other users, we are not sure if it's the right thing to go ahead on this. :/
Hopefully here is the right place to do so, because at the moment this makes the operator impossible for me to seriously consider.
@Tristan971 Thanks for chiming in and adding some valid points.
> It can be limited to secrets in the operator's namespace specifically (which is a very common pattern).
But this means the password is now in a different namespace from the Dragonfly instance, which seems confusing? Users who want a Dragonfly instance would want everything in a single namespace. wdyt? 🤔
> In particular, imagine someone cluelessly setting `.serviceSpec.type: LoadBalancer`, thinking they're fine because of the password auth...
Valid point!
> Users who want a Dragonfly instance would want everything in a single namespace. wdyt?
It is preferable that the secret can live where the relevant Dragonfly instance is, yes.
If we look at what everyone else does, generally, it boils down to:
- have a cluster-wide ON/OFF flag for the operator
- if the flag is OFF, use a role and monitor only the operator's namespace
- if the flag is ON, use a clusterrole and monitor all namespaces
This leaves cluster administrators with the option of doing per-namespace operator deployments, or cluster-wide ones with the associated elevated access to secrets, if they trust the operator enough.
And I think in general people would (and would be right to) trust a handful of operators more than they do hundreds of pods in their cluster handling user traffic etc.
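For concreteness, the flag-OFF case in the list above boils down to shipping a Role instead of a ClusterRole; a minimal sketch, with made-up names:

```yaml
# Hypothetical namespace-scoped Role: the operator may read Secrets
# only in its own namespace. Name and namespace are illustrative.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: dragonfly-operator-secret-reader
  namespace: dragonfly-operator
rules:
  - apiGroups: [""]
    resources: ["secrets"]
    verbs: ["get", "list", "watch"]
```

The flag-ON case is the same rule in a ClusterRole, bound with a ClusterRoleBinding.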
I was bitten by the same issue after changing my Dragonfly command line so that the admin port would only listen on 127.0.0.1 😅 I've left the port open for now, but being able to secure it somehow would indeed be great. Regular clients use short-lived TLS certificates, but this unauthenticated port 9999 almost looks like a backdoor next to them ;-)
I had to cobble something together to deal with https://github.com/dragonflydb/dragonfly/discussions/4871: my "operator" uses `kubectl exec` to log into the pod and connect to the admin port. Could we imagine this operator doing the same? We'd then be able to bind the admin port to localhost only and control access through regular k8s RBAC (only the operator would be allowed to exec into Dragonfly pods).
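For illustration, the RBAC half of that idea is just a grant on the `pods/exec` subresource, bound only to the operator's ServiceAccount; a sketch with made-up names:

```yaml
# Hypothetical Role: whoever is bound to it may exec into pods in this
# namespace (kubectl exec issues a "create" on the pods/exec subresource).
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: dragonfly-exec
  namespace: default
rules:
  - apiGroups: [""]
    resources: ["pods/exec"]
    verbs: ["create"]
```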