
vcluster does not start in limited RKE cluster

Open MShekow opened this issue 4 years ago • 13 comments

I have a restricted namespace in our internal RKE cluster, which is managed by Rancher. However, vcluster won't start up, and I have no idea what the concrete reason is, given that the log contains massive output.

Things seem to start going wrong with this log entry: cluster_authentication_trust_controller.go:493] kube-system/extension-apiserver-authentication failed with : Internal error occurred: resource quota evaluation timed out
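In case it helps with reproducing this: the quota state of the namespace can be inspected with kubectl (the namespace name below is just a placeholder):

kubectl describe resourcequota -n my-restricted-namespace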

But probably the attached log file will indicate the underlying reason better. vcluster1.log

The syncer log is very short:

I0629 13:25:32.393511       1 main.go:223] Using physical cluster at https://10.43.0.1:443
I0629 13:25:32.575521       1 main.go:254] Can connect to virtual cluster with version v1.20.4+k3s1
F0629 13:25:32.587987       1 main.go:138] register controllers: register secrets indices: no matches for kind "Ingress" in version "networking.k8s.io/v1beta1"

Any ideas?

MShekow avatar Jun 29 '21 13:06 MShekow

@MShekow thanks for creating this issue! It looks like ingresses are not available in your cluster. Can you try to create a values.yaml with:

syncer:
  extraArgs: ["--disable-sync-resources=ingresses"]

and then create the vcluster with vcluster create ... -f values.yaml and check if the same error occurs?

FabianKramm avatar Jun 29 '21 13:06 FabianKramm

Hmm, I do have another sample application deployed to that namespace that uses an Ingress, which starts like this and works fine:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
...

Notice the omission of beta (which is present in the syncer output log). So maybe it's just about the API version?
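As a sanity check, the API versions the host cluster actually serves can be listed with:

kubectl api-versions | grep networking.k8s.io

If networking.k8s.io/v1beta1 is missing from that output, it would explain the syncer error.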

Anyway, after applying the custom values.yaml, I get the following logs: syncer.log vcluster-0.log

MShekow avatar Jun 29 '21 15:06 MShekow

@MShekow Yes, the problem will be the beta version. The logs look better now; is the vcluster reachable? If not, the persistent storage in your cluster might have a problem. You can deploy vcluster without persistent storage with:

syncer:
  extraArgs: ["--disable-sync-resources=ingresses"]
storage:
  persistence: false

And then create the cluster with vcluster create ... -f values.yaml

FabianKramm avatar Jun 30 '21 08:06 FabianKramm

First of all, I had to deactivate the liveness probe of the vcluster's StatefulSet, as otherwise the containers would have been killed constantly (and the readiness probe returns a 403 HTTP status code). I was able to connect to the vcluster, but kubectl commands (using the vcluster kube config) took very long to complete (e.g. listing namespaces took about 45 seconds). I just wanted to verify that again, but since my last attempt yesterday, vcluster has magically recovered (the readiness probe is working), and vcluster commands (such as getting namespaces) work swiftly. There are still errors of the "database is locked" type, though.
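For reference, I removed the probe with a JSON patch roughly like the following (the StatefulSet name and namespace are placeholders):

kubectl patch statefulset vcluster-1 -n my-namespace --type=json -p='[{"op":"remove","path":"/spec/template/spec/containers/0/livenessProbe"}]'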

I then removed the vcluster and installed it again, this time with disabled persistence, as you suggested. The startup was very fast now, all probes are working right away, connecting to the vcluster works, and kubectl commands (using the vcluster kube config) are very swift, too. The coredns container also started (which was not the case in the previous setup).

Where does this leave me? Can persistence be safely disabled, or does vcluster need it? The underlying storage is NetApp (ext4 volumes) provisioned by Trident.

MShekow avatar Jun 30 '21 11:06 MShekow

@MShekow thanks for the update! Yes, we have seen this type of behaviour before when the persistent volume in use is very slow. By default, vcluster uses a local SQLite database to save the cluster state, persisted on a persistent volume. If you disable persistence, it uses the container overlay storage instead, which is usually really fast; that is why it is now working correctly.

Disabling persistence works for test purposes, but the problem is that all information might be lost as soon as the pod restarts: an emptyDir is bound to a specific node, so if the pod gets rescheduled or upgraded, the vcluster will be broken. Besides disabling persistence, you can also use an external datastore such as a MySQL, PostgreSQL or etcd database, which might be a good idea in your case. We have a guide for that in the docs.
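A rough sketch of pointing the bundled k3s at an external PostgreSQL instance (the connection string is a placeholder; the docs guide has the exact chart values):

vcluster:
  extraArgs:
    - --datastore-endpoint=postgres://username:password@hostname:5432/k3s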

Another way would be to add a storage class with a different, faster type of storage to the host Kubernetes cluster, which vcluster can then use instead of the NetApp storage.
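A minimal StorageClass sketch for that; the provisioner is a placeholder for whatever faster storage driver is available in the cluster:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-storage
provisioner: example.com/fast-csi-driver
volumeBindingMode: WaitForFirstConsumer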

FabianKramm avatar Jun 30 '21 11:06 FabianKramm

Thanks for your response, and the hints for using an external database.

I ran the dbench benchmark, getting the following results:

==================
= Dbench Summary =
==================
Random Read/Write IOPS: 3833/11.1k. BW: 157MiB/s / 166MiB/s
Average Latency (usec) Read/Write: 12336.37/1839.31
Sequential Read/Write: 346MiB/s / 197MiB/s
Mixed Random Read/Write IOPS: 2595/866

The numbers are not super impressive, but not too bad either. Why would SQLite perform this poorly?
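One caveat I realize with these numbers: SQLite (like etcd) typically syncs to disk on every committed write transaction, so fsync latency matters more than raw throughput. A fio run along these lines (the parameters and path are just an example) measures fdatasync latency directly:

fio --name=fsync-test --ioengine=sync --rw=write --bs=4k --size=64m --fdatasync=1 --directory=/path/to/netapp/volume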

MShekow avatar Jun 30 '21 14:06 MShekow

@MShekow Yeah, to be honest I'm not exactly sure why that is not enough; that's maybe more of a k3s problem. But are databases like MySQL or PostgreSQL working normally in your cluster?

FabianKramm avatar Jul 01 '21 11:07 FabianKramm

I haven't used relational databases extensively in that particular cluster yet.

I tried a PostgreSQL setup, following the guide in the vcluster manual. I used the Bitnami PostgreSQL Helm chart and set MAX_CONNECTIONS to 100. Still, the vcluster log is spammed with error messages of this form: level=error msg="error while range on /registry/health : pq: sorry, too many clients already". How many connections do you need?
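The actual connection usage on the PostgreSQL side can be checked with something like this (pod name and user are placeholders for the Bitnami deployment):

kubectl exec -it postgres-postgresql-0 -- psql -U postgres -c 'SELECT count(*) FROM pg_stat_activity;'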

I also observed that the coredns container starts with an error, notably: MountVolume.SetUp failed for volume "config-volume" : configmap references non-existent config key: NodeHosts, and its readiness probe is failing.
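The referenced key can be checked from inside the vcluster with something like the following (the kubeconfig path is whatever vcluster connect wrote out):

kubectl --kubeconfig ./kubeconfig.yaml get configmap coredns -n kube-system -o jsonpath='{.data}'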

The logs are: coredns-postgres.log syncer-postgres.log vcluster-postgres.log

I'm also wondering whether I should move the Ingress-sync problem (the omission of beta) into a separate ticket. The inability to sync Ingresses is a show-stopper for using vcluster in the first place.

MShekow avatar Jul 19 '21 13:07 MShekow

@MShekow thanks for the additional information! It seems like the database is really slow (at least in the beginning) based on the k3s logs, which might trigger the "too many clients" problem as k3s tries to recreate the postgres clients; I'm not exactly sure how k3s handles its internal connection pool, though. After a while, all the logs seem to look correct; is the vcluster reachable then?

Regarding the ingresses, we will update this soon anyway, since the ingress v1beta1 version will no longer work in k8s v1.22. But I'm still not sure why your host cluster is missing the v1beta1 ingress version, since that should be included in all current k8s versions by default.

FabianKramm avatar Jul 19 '21 14:07 FabianKramm

is the vcluster reachable then?

Yes, it seems to work fine.

Regarding the ingresses we will update this soon anyways as in k8s v1.22 the ingress v1beta1 version will not work anymore, but I'm still not sure why your host cluster is missing the v1beta1 ingress version since that should be included in all current k8s versions by default

I am not sure either. Looking forward to a new vcluster release.

MShekow avatar Jul 20 '21 07:07 MShekow

Considering the postgres client issue (pq: sorry, too many clients already): in this PR there is apparently a fix that lets you call k3s server with an argument such as --datastore-max-open-connections=10. However, when I try this with vcluster, it does not work:

I add the following segment to my custom-values.yaml:

extraArgs:
    - --service-cidr=10.96.0.0/12
    - --datastore-max-open-connections=10

But k3s won't come up. The error is: Incorrect Usage: flag provided but not defined: -datastore-max-open-connections
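To double-check which datastore flags the bundled k3s binary actually understands, its help output can be grepped (assuming shell access to the k3s container):

k3s server --help | grep datastore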

MShekow avatar Jul 27 '21 07:07 MShekow

@MShekow thanks for the information! As mentioned in the comments on the k3s issue, k3s does not yet support this flag; it would need to be added first, which is why you are seeing this error. But it looks like they are working on adding it, so I guess it should come soon.

FabianKramm avatar Jul 27 '21 11:07 FabianKramm

@MShekow @FabianKramm I had an issue with postgres as the k3s kine backend (https://github.com/k3s-io/kine/issues/63); it was better with the mysql backend.

I also had an issue with the max file descriptors limit (not sure how that maps to the vcluster context, though): https://github.com/orange-cloudfoundry/k3s-boshrelease/issues/49

poblin-orange avatar Aug 22 '21 10:08 poblin-orange

The k3s maintainers decided not to expose the database connection flags. I don't think we can influence or work around that, so I'll close this issue, but feel free to comment or create a new issue if you identify something we can do on the vcluster side.

Also, we have the option to install vcluster with k8s distro, which uses etcd. See docs here - https://www.vcluster.com/docs/operator/other-distributions
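For example, the distro can be selected at create time (the vcluster name and namespace below are placeholders):

vcluster create my-vcluster -n my-namespace --distro k8s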

matskiv avatar Nov 02 '22 18:11 matskiv