PV isn't created when deploying test application from docs.
The test application will not deploy; it waits in Pending. Please help, I am so close to getting this working.
Cluster Setup (prerequisites applied roughly as sketched after this list):
Worker nodes 1-3
- these do have the mayastor engine label added
- have the nvme-tcp kernel module loaded
- have hugepages set
- these do not have physical disks for the storage cluster
Storage nodes 1-3
- these do have the mayastor engine label added
- have the nvme-tcp kernel module loaded
- have hugepages set
- these do have physical disks for the storage cluster
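A minimal sketch of how those per-node prerequisites are typically applied (the label key and hugepage count follow the Mayastor docs; the persistence file locations are assumptions):
# run once per node (worker1-3 and storage1-3); storage1 shown
kubectl label node storage1 openebs.io/engine=mayastor   # mayastor engine label
sudo modprobe nvme-tcp                                   # nvme-tcp kernel module
sudo sysctl -w vm.nr_hugepages=1024                      # 2MiB hugepages
# persist across reboots (paths assumed):
echo nvme-tcp | sudo tee /etc/modules-load.d/nvme-tcp.conf
echo vm.nr_hugepages=1024 | sudo tee /etc/sysctl.d/10-mayastor.conf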
MSPs created:
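For illustration, a MayastorPool for this layout would look roughly like this (the pool name is hypothetical; node and disk follow the environment described later):
apiVersion: openebs.io/v1alpha1
kind: MayastorPool
metadata:
  name: pool-on-storage1   # name assumed for illustration
  namespace: mayastor
spec:
  node: storage1
  disks: ["/dev/sdb"]      # the unformatted SATA drive on each storage node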
Storage Class created:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: mayastor-1
parameters:
  fsType: xfs
  repl: '1'
  protocol: 'nvmf'
  ioTimeout: '60'
  local: 'false' # <-- this is deliberate. I need the pod requiring storage to be schedulable on any node.
provisioner: io.openebs.csi-mayastor
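A quick check that the class registered with the expected provisioner and parameters (filename assumed):
kubectl apply -f mayastor-1-sc.yaml   # filename assumed
kubectl get sc mayastor-1 -o yaml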
Then I try to provision the test application from the docs. The PVC is created but stays in Pending with the following error in the csi-controller pod:
W0830 15:13:24.748827 1 topology.go:321] No topology keys found on any node
W0830 15:13:24.748840 1 controller.go:958] Retrying syncing claim "924eb8c2-d999-4aed-b601-d63cd9d5bdcb", failure 7
E0830 15:13:24.748852 1 controller.go:981] error syncing claim "924eb8c2-d999-4aed-b601-d63cd9d5bdcb": failed to provision volume with StorageClass "mayastor-1": error generating accessibility requirements: no available topology found
I0830 15:13:24.748863 1 event.go:282] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"default", Name:"ms-volume-claim", UID:"924eb8c2-d999-4aed-b601-d63cd9d5bdcb", APIVersion:"v1", ResourceVersion:"234587", FieldPath:""}): type: 'Normal' reason: 'Provisioning' External provisioner is provisioning volume for claim "default/ms-volume-claim"
I0830 15:13:24.748871 1 event.go:282] Event(v1.ObjectReference{Kind:"PersistentVolumeClaim", Namespace:"default", Name:"ms-volume-claim", UID:"924eb8c2-d999-4aed-b601-d63cd9d5bdcb", APIVersion:"v1", ResourceVersion:"234587", FieldPath:""}): type: 'Warning' reason: 'ProvisioningFailed' failed to provision volume with StorageClass "mayastor-1": error generating accessibility requirements: no available topology found
The test pod also stays in Pending because the PVC never binds, and no PV is ever created. I tried to manually create a PV and the PVC did bind; however, the pod then failed with: AttachVolume.Attach failed for volume "pv0001" : CSINode storage1 does not contain driver io.openebs.csi-mayastor
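The "No topology keys found on any node" warning comes from the external-provisioner, which builds accessibility requirements out of the topology keys each node's CSI plugin registers. A quick way to see what the cluster is actually reporting (a sketch, assuming default kubectl access):
kubectl describe pvc ms-volume-claim   # surfaces the provisioning events above
kubectl get csinode -o custom-columns=NAME:.metadata.name,DRIVERS:.spec.drivers[*].name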
Hi @Daxcor69, can we take a look at the logs of the csi-node pods on the storage1 node? It seems the csi-node pod has not come up successfully on that node.
I0829 17:13:08.332189 1 main.go:113] Version: v2.1.0-0-g80d42f24
I0829 17:13:08.332607 1 connection.go:153] Connecting to unix:///csi/csi.sock
I0829 17:13:08.350705 1 node_register.go:52] Starting Registration Server at: /registration/io.openebs.csi-mayastor-reg.sock
I0829 17:13:08.351062 1 node_register.go:61] Registration Server started at: /registration/io.openebs.csi-mayastor-reg.sock
I0829 17:13:08.351419 1 node_register.go:83] Skipping healthz server because HTTP endpoint is set to: ""
[2022-08-29T17:13:06Z INFO mayastor_csi] Removed stale CSI socket /csi/csi.sock
[2022-08-29T17:13:06Z INFO mayastor_csi] CSI plugin bound to /csi/csi.sock
[2022-08-29T17:13:06Z INFO mayastor_csi::nodeplugin_grpc] Mayastor node plugin gRPC server configured at address 10.0.1.7:10199
[2022-08-29T17:13:08Z DEBUG mayastor_csi::identity] GetPluginInfo request (io.openebs.csi-mayastor:0.2)
That is all there is in the two containers on storage1.
Can we see the csinode object on storage1, i.e. kubectl get csinode storage1 -o yaml?
apiVersion: storage.k8s.io/v1
kind: CSINode
metadata:
  annotations:
    storage.alpha.kubernetes.io/migrated-plugins: kubernetes.io/aws-ebs,kubernetes.io/azure-disk,kubernetes.io/azure-file,kubernetes.io/cinder,kubernetes.io/gce-pd
  creationTimestamp: "2022-08-29T14:34:41Z"
  name: storage1
  ownerReferences:
  - apiVersion: v1
    kind: Node
    name: storage1
    uid: 58f13f19-8d6c-46cc-8cb8-47fc9dc10ccf
  resourceVersion: "382"
  uid: 2f8d4689-745b-4e63-bc42-bca1aeeaa2b2
spec:
  drivers: null
Seems like the mayastor csi driver is not registered! Can you restart the csi-node pod on storage1 and check if that changes anything in the csinode object?
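For comparison, on a node where registration succeeded, spec.drivers should list the plugin, roughly like this (values assumed for illustration):
spec:
  drivers:
  - name: io.openebs.csi-mayastor
    nodeID: storage1      # whatever node ID the plugin registered
    topologyKeys: null    # may be null or a list, depending on version
A restart plus re-check could look like this (the app=mayastor-csi label is assumed from the default daemonset):
kubectl -n mayastor delete pod -l app=mayastor-csi --field-selector spec.nodeName=storage1
kubectl get csinode storage1 -o yaml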
I restarted all six csi-node pods and there was no change.
I restarted every pod in the mayastor namespace. All pods come up running and healthy; the driver is still null on all CSINode objects.
I have completely removed mayastor from the k8s cluster and did a full reinstall. etcd-2 is still having issues: it can't find the other members. I have created the MSPs and they all show online. When I run the command requested above, I still get driver: null in the output of the CSINode object.
The error from your test application is:
failed to provision volume with StorageClass "mayastor-1": error generating accessibility requirements: no available topology found
Env: Ubuntu 22.04.1 minimal install on amd64 hardware. 8 TB SATA drives for storage use on /dev/sdb; not mounted, formatted, or partitioned. On storage nodes 1-3 I am running the k0s Kubernetes distribution on k8s 1.24.2.
No other applications or processes are running in the cluster or on the hosts.
The hosts have two networks: eth0 (public) and eth1 (private), the latter using 10.0.1.0/24 for the hosts. I have a firewall in place that blocks all traffic on the public network and allows 10.0.0.0/8 to any port on any system; this should cover all the networks. k0s was told to install on the public network with the following CIDRs: 10.96.0.0/16 and 10.244.0.0/12.
Deployment Goal: Storage nodes 1-3 each have 8 TB SATA drives that will be used for the cluster's storage requirements. Worker nodes 1-3 will have workload pods that will require storage from Mayastor. The storage nodes will also carry workloads.
The local flag will be set to 'false' so that all nodes have access to the storage resources, no matter where the pod is scheduled.
I am happy to provide any other logs or configs that you require. I want this to work; I don't want to give up.
For etcd: Are you using the non-prod example yaml files for etcd? If so, did you delete the data from the nodes?
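If so, stale member data left on the nodes can make a reinstalled etcd fail to find its peers. A rough cleanup sketch (the data path is an assumption; check where the example's volumes actually point first):
kubectl get pv -o yaml | grep -A3 hostPath   # find where the etcd data lives
# then, on each node that hosted an etcd replica:
sudo rm -rf /var/local/etcd/*                # path assumed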
For 1.0.2 I think the local flag might be broken, as the target must always live on a storage node. A workaround could be to label the worker nodes with the io-engine label (but don't create any pools there).
In the develop branch we're in the process of removing local altogether and letting the targets run on any node as long as it's got the engine label, decoupling them from the application nodes.
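For the 1.0.2 workaround above, the labelling would look like this (label key per the docs; only the label, no MayastorPools on these nodes):
kubectl label node worker1 openebs.io/engine=mayastor
kubectl label node worker2 openebs.io/engine=mayastor
kubectl label node worker3 openebs.io/engine=mayastor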
Ok, well, that was an important bit of information. Is this the release that is coming in late Sept?
This may be solved by setting --feature-gates=Topology=false on the csi-provisioner; that worked when I had the same problem. Here is the yaml file, csi-deployment.yaml (apply step sketched after it):
---
# Source: mayastor-control-plane/templates/csi-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: csi-controller
  namespace: mayastor
  labels:
    app: csi-controller
spec:
  replicas: 1
  selector:
    matchLabels:
      app: csi-controller
  template:
    metadata:
      labels:
        app: csi-controller
    spec:
      hostNetwork: true
      serviceAccount: mayastor-service-account
      dnsPolicy: ClusterFirstWithHostNet
      imagePullSecrets:
      - name: regcred
      initContainers:
      - command:
        - sh
        - -c
        - trap "exit 1" TERM; until nc -vz rest 8081; do echo "Waiting for REST API endpoint to become available"; sleep 1; done;
        image: busybox:latest
        name: rest-probe
      containers:
      - name: csi-provisioner
        image: k8s.gcr.io/sig-storage/csi-provisioner:v2.2.1
        args:
        - "--v=2"
        - "--csi-address=$(ADDRESS)"
        - "--feature-gates=Topology=false"
        - "--strict-topology=false"
        - "--default-fstype=ext4"
        env:
        - name: ADDRESS
          value: /var/lib/csi/sockets/pluginproxy/csi.sock
        imagePullPolicy: "IfNotPresent"
        volumeMounts:
        - name: socket-dir
          mountPath: /var/lib/csi/sockets/pluginproxy/
      - name: csi-attacher
        image: k8s.gcr.io/sig-storage/csi-attacher:v3.2.1
        args:
        - "--v=2"
        - "--csi-address=$(ADDRESS)"
        env:
        - name: ADDRESS
          value: /var/lib/csi/sockets/pluginproxy/csi.sock
        imagePullPolicy: "IfNotPresent"
        volumeMounts:
        - name: socket-dir
          mountPath: /var/lib/csi/sockets/pluginproxy/
      - name: csi-controller
        resources:
          limits:
            cpu: 32m
            memory: 128Mi
          requests:
            cpu: 16m
            memory: 64Mi
        image: mayadata/mcp-csi-controller:v1.0.3
        imagePullPolicy: IfNotPresent
        args:
        - "--csi-socket=/var/lib/csi/sockets/pluginproxy/csi.sock"
        - "--rest-endpoint=http://rest:8081"
        env:
        - name: RUST_LOG
          value: info
        volumeMounts:
        - name: socket-dir
          mountPath: /var/lib/csi/sockets/pluginproxy/
      volumes:
      - name: socket-dir
        emptyDir: {}
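With Topology=false the provisioner stops generating accessibility requirements altogether, so the claim no longer depends on CSINode topology keys. Re-apply and let the controller roll, e.g.:
kubectl apply -f csi-deployment.yaml
kubectl -n mayastor rollout status deploy/csi-controller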
@Daxcor69 were you able to try the "next release"?
No, I moved on to a different solution.