kine
CockroachDB causes problems with K3s using Postgres driver
I'm able to connect to CockroachDB with K3s and Kine; however, K3s will not work with CockroachDB. I'm not sure what I can provide here besides the output.
I do get a lot of RBAC errors like this:
heduler" cannot list resource "storageclasses" in API group "storage.k8s.io" at the cluster scope
Jul 11 10:15:39 virt-0 k3s[14160]: E0711 10:15:39.312697 14160 reflector.go:153] k8s.io/client-go/informers/factory.go:135: Failed to list *v1.StatefulSet: statefulsets.apps is forbidden: User "system:kube-scheduler" cannot list resource "statefulsets" in API group "apps" at the cluster scope
Jul 11 10:15:39 virt-0 k3s[14160]: E0711 10:15:39.312887 14160 reflector.go:153] k8s.io/client-go/informers/factory.go:135: Failed to list *v1.Node: nodes is forbidden: User "system:kube-scheduler" cannot list resource "nodes" in API group "" at the cluster scope
Jul 11 10:15:39 virt-0 k3s[14160]: E0711 10:15:39.325938 14160 reflector.go:153] k8s.io/client-go/informers/factory.go:135: Failed to list *v1.PersistentVolume: persistentvolumes is forbidden: User "system:kube-scheduler" cannot list resource "persistentvolumes" in API group "" at the cluster scope
Jul 11 10:15:39 virt-0 k3s[14160]: E0711 10:15:39.333258 14160 reflector.go:153] k8s.io/client-go/informers/factory.go:135: Failed to list *v1.ReplicaSet: replicasets.apps is forbidden: User "system:kube-scheduler" cannot list resource "replicasets" in API group "apps" at the cluster scope
Jul 11 10:15:39 virt-0 k3s[14160]: E0711 10:15:39.342171 14160 reflector.go:153] k8s.io/client-go/informers/factory.go:135: Failed to list *v1.CSINode: csinodes.storage.k8s.io is forbidden: User "system:kube-scheduler" cannot list resource "csinodes" in API group "storage.k8s.io" at the cluster scope
Jul 11 10:15:39 virt-0 k3s[14160]: E0711 10:15:39.356777 14160 reflector.go:153] k8s.io/client-go/informers/factory.go:135: Failed to list *v1.Service: services is forbidden: User "system:kube-scheduler" cannot list resource "services" in API group "" at the cluster scope
Jul 11 10:15:39 virt-0 k3s[14160]: E0711 10:15:39.362120 14160 reflector.go:153] k8s.io/client-go/informers/factory.go:135: Failed to list *v1beta1.PodDisruptionBudget: poddisruptionbudgets.policy is forbidden: User "system:kube-scheduler" cannot list resource "poddisruptionbudgets" in API group "policy" at the cluster scope
Jul 11 10:15:39 virt-0 k3s[14160]: E0711 10:15:39.367089 14160 reflector.go:153] k8s.io/kubernetes/cmd/kube-scheduler/app/server.go:246: Failed to list *v1.Pod: pods is forbidden: User "system:kube-scheduler" cannot list resource "pods" in API group "" at the cluster scope
Jul 11 10:15:39 virt-0 k3s[14160]: E0711 10:15:39.387236 14160 reflector.go:153] k8s.io/client-go/informers/factory.go:135: Failed to list *v1.PersistentVolumeClaim: persistentvolumeclaims is forbidden: User "system:kube-scheduler" cannot list resource "persistentvolumeclaims" in API group "" at the cluster scope
Jul 11 10:15:39 virt-0 k3s[14160]: E0711 10:15:39.391674 14160 reflector.go:153] k8s.io/client-go/informers/factory.go:135: Failed to list *v1.ReplicationController: replicationcontrollers is forbidden: User "system:kube-scheduler" cannot list resource "replicationcontrollers" in API group "" at the cluster scope
After a while, I get some pq errors:
Jul 11 10:16:13 virt-0 k3s[14160]: time="2020-07-11T10:16:13.396427502-04:00" level=error msg="error while range on /registry/deployments /registry/deployments: pq: internal error while retrieving user account"
Jul 11 10:16:17 virt-0 k3s[14160]: time="2020-07-11T10:16:17.834086756-04:00" level=error msg="error while range on /registry/configmaps/kube-system/k3s : pq: internal error while retrieving user account"
Jul 11 10:16:17 virt-0 k3s[14160]: time="2020-07-11T10:16:17.834596643-04:00" level=error msg="error while range on /registry/ranges/servicenodeports : pq: internal error while retrieving user account"
Jul 11 10:16:17 virt-0 k3s[14160]: time="2020-07-11T10:16:17.834881953-04:00" level=error msg="error while range on /registry/namespaces/default : pq: internal error while retrieving user account"
Jul 11 10:16:17 virt-0 k3s[14160]: E0711 10:16:17.836634 14160 status.go:71] apiserver received an error that is not an metav1.Status: &status.statusError{Code:2, Message:"pq: internal error while retrieving user account"
, Details:[]*any.Any(nil), XXX_NoUnkeyedLiteral:struct {}{}, XXX_unrecognized:[]uint8(nil), XXX_sizecache:0}
Jul 11 10:16:17 virt-0 k3s[14160]: time="2020-07-11T10:16:17.835201049-04:00" level=error msg="error while range on /registry/ranges/serviceips : pq: internal error while retrieving user account"
Jul 11 10:16:17 virt-0 k3s[14160]: E0711 10:16:17.838544 14160 repair.go:100] unable to refresh the service IP block: rpc error: code = Unknown desc = pq: internal error while retrieving user account
Jul 11 10:16:17 virt-0 k3s[14160]: E0711 10:16:17.837511 14160 repair.go:73] unable to refresh the port allocations: rpc error: code = Unknown desc = pq: internal error while retrieving user account
Jul 11 10:16:17 virt-0 k3s[14160]: E0711 10:16:17.838010 14160 status.go:71] apiserver received an error that is not an metav1.Status: &status.statusError{Code:2, Message:"pq: internal error while retrieving user account"
, Details:[]*any.Any(nil), XXX_NoUnkeyedLiteral:struct {}{}, XXX_unrecognized:[]uint8(nil), XXX_sizecache:0}
Jul 11 10:16:17 virt-0 k3s[14160]: E0711 10:16:17.840814 14160 leaderelection.go:331] error retrieving resource lock kube-system/k3s: rpc error: code = Unknown desc = pq: internal error while retrieving user account
Additionally, when using certificate auth, K3s will eventually restart while waiting for some CRDs to complete; I'm not sure that is specific to CockroachDB, though.
I had an issue running CockroachDB as well, and if memory serves I saw similar errors. My problem was related to the ID field: the SERIAL type specified in the schema is only a pseudo data type that CockroachDB provides for compatibility with Postgres (rather than sequential values, it produces something closer to a unique random ID). I think what I ended up doing was adding this to the connection string:
experimental_serial_normalization=sql_sequence
Some more details here: https://www.cockroachlabs.com/docs/stable/serial.html https://www.cockroachlabs.com/docs/stable/experimental-features.html#session-variables
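For illustration, here is a minimal sketch of passing that parameter through the K3s datastore endpoint. The hostname, credentials, database name, and sslmode below are all placeholders for whatever your deployment uses; the relevant part is the experimental_serial_normalization query parameter at the end.

```shell
# Hypothetical endpoint -- host, user, and sslmode are placeholders.
# The fix from the comment above is the final query parameter.
k3s server \
  --datastore-endpoint="postgres://root@crdb-host:26257/kine?sslmode=verify-full&experimental_serial_normalization=sql_sequence"
```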
FYI, the current way of setting sql_sequence is SET CLUSTER SETTING sql.defaults.serial_normalization=2; run with an admin account. It is a cluster-wide setting.
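A sketch of applying that cluster setting from the command line, assuming a secure cluster (the --certs-dir and --host values are placeholders for your deployment; CockroachDB also accepts the string form 'sql_sequence' for this enum setting, equivalent to the numeric value 2):

```shell
# Placeholders: --certs-dir and --host depend on your deployment.
# Must be run by an admin user; the setting applies cluster-wide.
cockroach sql --certs-dir=certs --host=crdb-host:26257 \
  --execute="SET CLUSTER SETTING sql.defaults.serial_normalization = 'sql_sequence';" \
  --execute="SHOW CLUSTER SETTING sql.defaults.serial_normalization;"
```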
This sql.defaults.serial_normalization setting works perfectly, and I now have an up-and-running K3s cluster. It solved my issue as mentioned in k3s-io/k3s#2613, although the symptoms there differ slightly from this issue (I do not see any pg internal errors). The documentation is now at https://www.cockroachlabs.com/docs/v20.2/cluster-settings.html .
To close this issue, you would probably want to add this to the k3s or kine documentation, since serial normalization is no longer an experimental feature in CockroachDB. Alternatively, the driver could support non-sequential IDs for CockroachDB. IMHO, CockroachDB can be a great option as the HA database for k3s.
We don't have a CockroachDB-specific driver. Technically it's not supported - we've only tested with PostgreSQL itself, not with any of the various projects that offer a compatible interface.