vcluster not working - chmod kine.sock: no such file or directory
I am trying to create a vcluster with the vcluster CLI, but it fails with the message below and I can't figure out how to solve it.
time="2021-05-15T13:48:57.625861730Z" level=fatal msg="starting kubernetes: preparing server: creating storage endpoint: chmod kine.sock: no such file or directory"
I believe kine is the SQL shim for etcd; K3s uses it to persist the cluster state. Are you using k3s?
@gabeduke thanks for the reply. No, the host cluster is Kubernetes 1.19.3 set up with kubeadm. Which information do you need?
@kbc8894 thanks for creating this issue! I assume the log message you posted is from the virtual cluster container in the vcluster statefulset correct? What type of filesystem are you using for persistent volumes and your nodes? This issue (https://github.com/k3s-io/k3s/issues/3137) describes the same error for k3s, and it seems to be filesystem related.
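For example, one way to see what is backing the vcluster's data volume is to look at the bound PV (the PVC created by the chart is typically named data-<vcluster-name>-0; the namespace and volume name below are placeholders):
$ kubectl get pvc -n <namespace> data-<vcluster-name>-0
$ kubectl get pv <volume-name-from-the-pvc> -o yaml
The PV spec will show whether the volume is NFS, a CSI driver, hostPath, etc.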
@FabianKramm
I assume the log message you posted is from the virtual cluster container in the vcluster statefulset correct?
Yes, that's right!
What type of filesystem are you using for persistent volumes and your nodes?
NFS is used (nfs-client provisioner).
@kbc8894 thanks for the info! Would be interesting to know if this error also occurs if you use an emptyDir volume instead of a persistent volume with NFS.
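For reference, a quick way to test the emptyDir route is to disable persistence in the chart values; with recent k3s-based charts this looks roughly like the snippet below (the exact value names may differ between chart versions, so treat it as a sketch):
# values.yaml
storage:
  persistence: false  # use an emptyDir instead of a PVC for the /data volume
Then pass the file with the CLI's -f/--extra-values flag when creating the vcluster.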
@FabianKramm I tried creating it with an emptyDir and it works fine, but when I try to create a PV with NFS it shows "no resources found".
kind: PersistentVolume
metadata:
  name: nfs
spec:
  capacity:
    storage: 1000Gi
  accessModes:
  - ReadWriteMany
  nfs:
    server: SERVER_IP
    path: /src/nfs
  mountOptions:
  - vers=4
We are using vcluster version 0.3.3. The same YAML works on the host k8s cluster.
@jnbhavya sorry for the late response, are you trying to create this PV inside the vcluster? Currently vcluster does not support that, but we are working on it (see #102)
Hello,
I'm seeing the exact same issue (chmod kine.sock: no such file or directory) on an OpenShift cluster with vcluster version 0.8.1.
I installed vcluster with the values below. The PVC is using CephFS-based storage (not NFS like in the previous comments).
$ cat values.yaml
# https://www.vcluster.com/docs/operator/restricted-hosts
openshift:
  enabled: true
# https://www.vcluster.com/docs/operator/external-access#ingress
syncer:
  extraArgs:
  - --tls-san=example.com
$ vcluster create --debug test -f values.yaml
[info] execute command: helm upgrade test vcluster --repo https://charts.loft.sh --version 0.8.1 --kubeconfig /tmp/4255751651 --namespace goproxy --install --repository-config='' --values /tmp/1729210185 --values values.yaml
[done] √ Successfully created virtual cluster test in namespace goproxy.
- Use 'vcluster connect test --namespace goproxy' to access the virtual cluster
- Use `vcluster connect test --namespace goproxy -- kubectl get ns` to run a command directly within the vcluster
$ oc get pvc -o yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:
    pv.kubernetes.io/bind-completed: "yes"
    pv.kubernetes.io/bound-by-controller: "yes"
    volume.beta.kubernetes.io/storage-provisioner: cephfs.manila.csi.openstack.org
  creationTimestamp: "2022-06-03T06:37:50Z"
  finalizers:
  - kubernetes.io/pvc-protection
  labels:
    app: vcluster
    release: test
  name: data-test-0
  namespace: goproxy
  resourceVersion: "697495484"
  uid: c06e3ca2-4701-4ab0-bd11-d666cbe2b571
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
  storageClassName: cephfs
  volumeMode: Filesystem
  volumeName: pvc-c06e3ca2-4701-4ab0-bd11-d666cbe2b571
status:
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: 5Gi
  phase: Bound
$ oc get pods -o yaml
apiVersion: v1
kind: Pod
metadata:
  annotations:
    k8s.v1.cni.cncf.io/network-status: |-
      [{
          "name": "openshift-sdn",
          "interface": "eth0",
          "ips": [
              "10.76.82.211"
          ],
          "default": true,
          "dns": {}
      }]
    k8s.v1.cni.cncf.io/networks-status: |-
      [{
          "name": "openshift-sdn",
          "interface": "eth0",
          "ips": [
              "10.76.82.211"
          ],
          "default": true,
          "dns": {}
      }]
    kubernetes.io/limit-ranger: 'LimitRanger plugin set: cpu limit for container vcluster;
      cpu limit for container syncer'
    openshift.io/scc: restricted
  creationTimestamp: "2022-06-03T06:37:50Z"
  generateName: test-
  labels:
    app: vcluster
    controller-revision-hash: test-77bf5587f8
    release: test
    statefulset.kubernetes.io/pod-name: test-0
  name: test-0
  namespace: goproxy
  ownerReferences:
  - apiVersion: apps/v1
    blockOwnerDeletion: true
    controller: true
    kind: StatefulSet
    name: test
    uid: ee730ca9-249d-4e9a-8f43-00f1b0f5b16a
  resourceVersion: "697499216"
  uid: 75ac6b7b-d07f-4481-8e9e-30e308cd331f
spec:
  affinity: {}
  containers:
  - args:
    - -c
    - /bin/k3s server --write-kubeconfig=/data/k3s-config/kube-config.yaml --data-dir=/data
      --disable=traefik,servicelb,metrics-server,local-storage,coredns --disable-network-policy
      --disable-agent --disable-cloud-controller --flannel-backend=none --disable-scheduler
      --kube-controller-manager-arg=controllers=*,-nodeipam,-nodelifecycle,-persistentvolume-binder,-attachdetach,-persistentvolume-expander,-cloud-node-lifecycle
      --kube-apiserver-arg=endpoint-reconciler-type=none --service-cidr=172.30.0.0/16
      && true
    command:
    - /bin/sh
    image: rancher/k3s:v1.22.8-k3s1
    imagePullPolicy: IfNotPresent
    name: vcluster
    resources:
      limits:
        cpu: "1"
        memory: 2Gi
      requests:
        cpu: 200m
        memory: 256Mi
    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        drop:
        - KILL
        - MKNOD
        - SETGID
        - SETUID
      runAsUser: 1010310000
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /data
      name: data
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: kube-api-access-g9ccs
      readOnly: true
  - args:
    - --name=test
    - --service-account=vc-workload-test
    - --tls-san=vcluster-cubieserver.app.cern.ch
    image: loftsh/vcluster:0.8.1
    imagePullPolicy: IfNotPresent
    livenessProbe:
      failureThreshold: 60
      httpGet:
        path: /healthz
        port: 8443
        scheme: HTTPS
      initialDelaySeconds: 60
      periodSeconds: 2
      successThreshold: 1
    name: syncer
    readinessProbe:
      failureThreshold: 60
      httpGet:
        path: /readyz
        port: 8443
        scheme: HTTPS
      periodSeconds: 2
      successThreshold: 1
      timeoutSeconds: 1
    resources:
      limits:
        cpu: "1"
        memory: 1Gi
      requests:
        cpu: 100m
        memory: 128Mi
    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        drop:
        - KILL
        - MKNOD
        - SETGID
        - SETUID
      runAsUser: 1010310000
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /manifests/coredns
      name: coredns
      readOnly: true
    - mountPath: /data
      name: data
      readOnly: true
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: kube-api-access-g9ccs
      readOnly: true
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  hostname: test-0
  imagePullSecrets:
  - name: vc-test-dockercfg-9z5jc
  nodeName: standard-node-xxx
  nodeSelector:
    node-role.kubernetes.io/standard: ""
  preemptionPolicy: PreemptLowerPriority
  priority: 0
  restartPolicy: Always
  schedulerName: default-scheduler
  securityContext:
    fsGroup: 1010310000
    seLinuxOptions:
      level: s0:c102,c4
  serviceAccount: vc-test
  serviceAccountName: vc-test
  subdomain: test-headless
  terminationGracePeriodSeconds: 10
  tolerations:
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 300
  - effect: NoSchedule
    key: node.kubernetes.io/memory-pressure
    operator: Exists
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: data-test-0
  - configMap:
      defaultMode: 420
      name: test-coredns
    name: coredns
  - name: kube-api-access-g9ccs
    projected:
      defaultMode: 420
      sources:
      - serviceAccountToken:
          expirationSeconds: 3607
          path: token
      - configMap:
          items:
          - key: ca.crt
            path: ca.crt
          name: kube-root-ca.crt
      - downwardAPI:
          items:
          - fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
            path: namespace
      - configMap:
          items:
          - key: service-ca.crt
            path: service-ca.crt
          name: openshift-service-ca.crt
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2022-06-03T06:38:03Z"
    status: "True"
    type: Initialized
  - lastProbeTime: null
    lastTransitionTime: "2022-06-03T06:38:03Z"
    message: 'containers with unready status: [vcluster syncer]'
    reason: ContainersNotReady
    status: "False"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: "2022-06-03T06:38:03Z"
    message: 'containers with unready status: [vcluster syncer]'
    reason: ContainersNotReady
    status: "False"
    type: ContainersReady
  - lastProbeTime: null
    lastTransitionTime: "2022-06-03T06:38:03Z"
    status: "True"
    type: PodScheduled
  containerStatuses:
  - containerID: cri-o://f76a00fc9ec549845b2016b52c6c030f3af351144dc867bcab027e2017451221
    image: docker.io/loftsh/vcluster:0.8.1
    imageID: docker.io/loftsh/vcluster@sha256:495fc75b50ec1f71a12ed201c15e9621a4073e2db729bc76aa163dab5d70b80e
    lastState: {}
    name: syncer
    ready: false
    restartCount: 0
    started: true
    state:
      running:
        startedAt: "2022-06-03T06:38:28Z"
  - containerID: cri-o://37b507238e774b396f770cc231ef5a12c4b31a7cef3ed205a8b06eeadba81b0a
    image: docker.io/rancher/k3s:v1.22.8-k3s1
    imageID: docker.io/rancher/k3s@sha256:5a03ee2ab56f7bb051f7aefe979c432ab42624af0d4d3a9ba739c151c684d4a7
    lastState:
      terminated:
        containerID: cri-o://ce7e099aec8d70362a3fa72d97548c27200c4367dc616ab272739c802d99af06
        exitCode: 1
        finishedAt: "2022-06-03T06:39:16Z"
        reason: Error
        startedAt: "2022-06-03T06:39:16Z"
    name: vcluster
    ready: false
    restartCount: 4
    started: false
    state:
      terminated:
        containerID: cri-o://37b507238e774b396f770cc231ef5a12c4b31a7cef3ed205a8b06eeadba81b0a
        exitCode: 1
        finishedAt: "2022-06-03T06:40:10Z"
        reason: Error
        startedAt: "2022-06-03T06:40:09Z"
  hostIP: 1.1.1.1
  phase: Running
  podIP: 10.76.82.211
  podIPs:
  - ip: 10.76.82.211
  qosClass: Burstable
  startTime: "2022-06-03T06:38:03Z"
$ oc logs test-0 -c vcluster
time="2022-06-03T06:41:31Z" level=info msg="Starting k3s v1.22.8+k3s1 (21fed356)"
time="2022-06-03T06:41:31Z" level=info msg="Configuring sqlite3 database connection pooling: maxIdleConns=2, maxOpenConns=0, connMaxLifetime=0s"
time="2022-06-03T06:41:31Z" level=info msg="Configuring database table schema and indexes, this may take a moment..."
time="2022-06-03T06:41:31Z" level=info msg="Database tables and indexes are up to date"
time="2022-06-03T06:41:31Z" level=fatal msg="starting kubernetes: preparing server: creating storage endpoint: creating listener: chmod kine.sock: no such file or directory"
$ oc logs test-0 -c syncer
I0603 06:44:29.219844 1 start.go:166] couldn't find virtual cluster kube-config, will retry in 1 seconds
I0603 06:44:30.219876 1 start.go:166] couldn't find virtual cluster kube-config, will retry in 1 seconds
I0603 06:44:31.219801 1 start.go:166] couldn't find virtual cluster kube-config, will retry in 1 seconds
I0603 06:44:32.219963 1 start.go:166] couldn't find virtual cluster kube-config, will retry in 1 seconds
...
Please let me know if there are any other commands I can run to debug this issue.
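In the meantime, one way I could try to isolate the failing operation outside of vcluster is a throwaway PVC/pod on the same cephfs storage class that binds a unix socket and then chmods it, which is what the kine error suggests is failing. A minimal sketch (all names below are made up; only storageClassName matters, and the securityContext may need tweaking for the restricted SCC):
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: kine-sock-test
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: cephfs
  resources:
    requests:
      storage: 1Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: kine-sock-test
spec:
  restartPolicy: Never
  containers:
  - name: test
    image: python:3-alpine
    # Bind a unix domain socket on the volume and chmod it, mimicking what kine appears to do.
    command:
    - python3
    - -c
    - |
      import os, socket
      s = socket.socket(socket.AF_UNIX)
      s.bind("/data/test.sock")
      os.chmod("/data/test.sock", 0o600)
      print("unix socket + chmod worked on this volume")
    volumeMounts:
    - mountPath: /data
      name: data
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: kine-sock-test
If this pod fails with a similar "no such file or directory" error, the problem is the filesystem rather than vcluster itself.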
@jacksgt thank you for reporting that you are also experiencing this problem and for providing detailed information, and apologies for not noticing your comment sooner. We will triage this issue soon. If you have resolved the problem in the meantime, please let us know :)
@carlmontanari Have you had any issues with vcluster in your lab setup with NFS?
Yeah, with k3s things will not start, or will start and time out. I've been just using k0s or k8s with no issues though. It's been a while since I looked at that but I think I put all the info I know in #646 so hopefully that is helpful! Otherwise, testing with k0s/k8s distros should be able to confirm that it is the same/similar issue (w/ nfs).
@carlmontanari Thanks!
So since the problem is still present, we will keep this issue open in the backlog. For now, the workaround is to try a different distro.
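For example, with recent CLI versions the distro can be selected at create time (the --distro flag may not exist on very old releases), roughly:
$ vcluster create test --namespace goproxy --distro k0s
$ vcluster create test --namespace goproxy --distro k8s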
Maybe https://github.com/loft-sh/vcluster/pull/1320/ can be a fix for this issue?
@felipecrs we have some issues running etcd on EFS (https://github.com/loft-sh/vcluster/issues/1342), but with eks-d. We went with this approach because the base distros (like k3s or k0s) use sqlite storage, which will not work on NFS (I still need to dig up the reference).
In our case we are able to use vcluster on EFS, but we get a huge bill for EFS writes (~$100/month per etcd) since it seems we are constantly writing (details in the ticket).
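One more possible workaround (untested on our side) is to keep k3s but point it at an external etcd datastore, since k3s talks to etcd endpoints directly and skips the kine/sqlite path that creates kine.sock. With the k3s-based chart this could look roughly like the values snippet below (vcluster.extraArgs is assumed to be passed through to the k3s command; the endpoint and cert paths are placeholders, and the certificates would still need to be mounted into the vcluster container):
vcluster:
  extraArgs:
  - --datastore-endpoint=https://ETCD_HOST:2379
  - --datastore-cafile=/certs/etcd-ca.crt
  - --datastore-certfile=/certs/etcd-client.crt
  - --datastore-keyfile=/certs/etcd-client.key
That keeps the cluster state off the NFS/EFS volume entirely.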