
Unclear documentation

Open jz543fm opened this issue 1 year ago • 5 comments

Describe the feature you'd like to have

Clear, well-organized documentation. Currently it is spread across many pages in many directories, which makes it hard to navigate and difficult to understand in a short period of time.

What is the value to the end user? (why is it a priority?)

It can save time and improve readability: one unified set of documentation.

How would the end user gain value from having this feature?

Clear instructions on how to use this plugin.

How will we know we have a good solution? (acceptance criteria)

  • Feedback from end users on whether the documentation is useful, clearly written, unified, and so on.

jz543fm avatar Feb 27 '24 12:02 jz543fm

Could you be a little more explicit in what you expect of the documentation? Proposals for restructuring are definitely welcome, ideally send them as pull-requests :tada:

There is https://github.com/ceph/ceph-csi/blob/devel/docs/deploy-cephfs.md#deployment-with-kubernetes which explains how to deploy the CephFS driver on Kubernetes; the same kind of document is available for RBD.

The usage of CSI operations to manage volumes depends on the container platform. Ceph-CSI mostly focuses on Kubernetes with dynamic provisioning of PersistentVolumeClaims; other platforms use different terminology for the volumes. Explaining how to create volumes on different container platforms is out of scope for the documentation of the driver.

nixpanic avatar Feb 29 '24 09:02 nixpanic

Yes, but the documentation is quite misleading and not well structured: it is spread across multiple directories without good cross-references between the pages. That makes it harder to read and navigate, which costs time and lowers the chance of a successful deployment.

I tried to install it with kubectl and plain YAML manifests, but even after spending a lot of time I could not get it working. Then I found out, I don't remember on which page, that I could install it with a bash script, and only that worked, even though the script was last changed two years ago.

I saw https://docs.ceph.com/en/latest/rbd/rbd-kubernetes/ too, but it is not up to date.

I would expect an optimized structure, for example a brief overview of what the end user can find in the docs, with references to each part. I ran into a permission error, so the docs should state, before you even deploy the plugin, that you need to define the capabilities described in https://github.com/ceph/ceph-csi/blob/devel/docs/capabilities.md
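To illustrate the kind of step I mean: before deploying, a Ceph client with enough capabilities has to exist. The entity name and capability strings below are my assumptions for a CephFS setup, not the authoritative list (that is in capabilities.md); the command is only printed here so it can be reviewed before being run against a real cluster:

```shell
# Assumption: illustrative CephFS capabilities only; the authoritative
# list is in docs/capabilities.md. Printed for review, not executed.
CMD="ceph auth get-or-create client.csi-cephfs \
  mon 'allow r' mgr 'allow rw' \
  osd 'allow rw tag cephfs metadata=*, allow rw tag cephfs data=*' \
  mds 'allow rw'"
echo "$CMD"
```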

jz543fm avatar Mar 01 '24 14:03 jz543fm

I'm also having a hard time setting things up. In particular, I think the part about installing the CephFS plugin with Helm (https://github.com/ceph/ceph-csi/blob/devel/charts/ceph-csi-cephfs/README.md) could be significantly improved.

For instance, it is not clear to me what the different config options and ConfigMaps do. These are in the chart's README (https://github.com/ceph/ceph-csi/tree/devel/charts/ceph-csi-cephfs):

  • configMapName: "Name of the configmap which contains cluster configuration"
  • cephConfConfigMapName: "Name of the configmap which contains ceph.conf configuration"
  • csiConfig: "Configuration for the CSI to connect to the cluster"

However, values.yaml also contains cephconf, with the comment "This is a sample configmap that helps define a Ceph configuration as required by the CSI plugins", which is not in the README. Also, both configMapName and cephConfConfigMapName, which are documented in the README, sit in a section commented as "Variables for 'internal' use please use with caution!"

Some things appear to be out of date. For instance, the README mentions the configuration parameters nodeplugin.podSecurityPolicy.enabled and provisioner.podSecurityPolicy.enabled, but as far as I can tell they are not used in any template:

gpothier@tadzim4:/tmp/ceph-csi $ grep -R charts/ceph-csi-cephfs -e podSecurityPolicy
charts/ceph-csi-cephfs/README.md:| `nodeplugin.podSecurityPolicy.enabled`         | If true, create & use [Pod Security Policy resources](https://kubernetes.io/docs/concepts/policy/pod-security-policy/).                              | `false`                                            |
charts/ceph-csi-cephfs/README.md:| `provisioner.podSecurityPolicy.enabled`        | Specifies whether podSecurityPolicy is enabled                                                                                                       | `false`                                            |
charts/ceph-csi-cephfs/README.md:helm install --set configMapName=ceph-csi-config --set provisioner.podSecurityPolicy.enabled=true

I've been following the instructions from an external page, which helped me a lot: https://www.pivert.org/ceph-csi-on-kubernetes-1-24/, and I think some of them could be included in the README. I would suggest the following:

Explain that a CephFS volume and subvolume group are needed and how to create them in the cluster, noting that the subvolume group name is what goes into csiConfig[x].cephFS.subvolumeGroup and the volume name is what goes into storageClass.fsName:

ceph fs volume create myvol
ceph fs subvolumegroup create myvol csi

Explain what to put in the ceph-config ConfigMap (if it is both correct and needed: as stated above, I'm not sure it is, since it appears to carry the same information as csiConfig):

kubectl create namespace ceph-csi-cephfs  # If needed

ceph config generate-minimal-conf > /tmp/csi-ceph.conf
kubectl --namespace ceph-csi-cephfs \
	create configmap ceph-config \
	--from-file=ceph.conf=/tmp/csi-ceph.conf

Explain how to create the csi-cephfs-secret secret:

KEY=$(ceph auth get client.admin -f json | jq -r '.[0].key')

kubectl --namespace ceph-csi-cephfs \
	create secret generic csi-cephfs-secret \
	--from-literal=adminID=admin \
	--from-literal=adminKey="${KEY}"
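As a side note, the jq filter above assumes that `ceph auth get -f json` returns a JSON array of auth entries; here is a self-contained sketch with a made-up key, just to show what the filter extracts (the JSON shape is an assumption based on that output format):

```shell
# Made-up sample mimicking the (assumed) shape of
# `ceph auth get client.admin -f json` output:
SAMPLE='[{"entity":"client.admin","key":"AQDExampleOnlyKey==","caps":{"mon":"allow *"}}]'

# Same filter as above: pull the keyring secret that goes into adminKey.
KEY=$(printf '%s' "$SAMPLE" | jq -r '.[0].key')
echo "$KEY"   # prints AQDExampleOnlyKey==
```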

Give an example minimal content for values.yaml:

---
csiConfig:
  - clusterID: <cluster id as returned by ceph config generate-minimal-conf>
    monitors:
      - <list of monitor IPs, as returned by ceph config generate-minimal-conf>
      - ...
      - ...
    cephFS:
      subvolumeGroup: "csi"
      netNamespaceFilePath: "{{ .kubeletDir }}/plugins/{{ .driverName }}/net"


storageClass:
  # Specifies whether the Storage class should be created
  create: true
  name: mysc

  # String representing a Ceph cluster to provision storage from.
  # Should be unique across all Ceph clusters in use for provisioning,
  # cannot be greater than 36 bytes in length, and should remain immutable for
  # the lifetime of the StorageClass in use.
  clusterID: <cluster id as returned by ceph config generate-minimal-conf>
  # (required) CephFS filesystem name into which the volume shall be created
  # eg: fsName: myfs
  fsName: myvol
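The clusterID and monitors placeholders above can be read straight out of the minimal conf. A sketch with invented values (the file shape follows what `ceph config generate-minimal-conf` typically emits; the fsid and addresses are made up):

```shell
# Invented sample in the shape typically emitted by
# `ceph config generate-minimal-conf`:
cat > /tmp/csi-ceph-sample.conf <<'EOF'
[global]
	fsid = b9127830-0000-0000-0000-9d1a2e9949a8
	mon_host = [v2:192.168.1.10:3300/0,v1:192.168.1.10:6789/0]
EOF

# csiConfig[x].clusterID:
sed -n 's/^[[:space:]]*fsid = //p' /tmp/csi-ceph-sample.conf

# csiConfig[x].monitors are derived from mon_host:
sed -n 's/^[[:space:]]*mon_host = //p' /tmp/csi-ceph-sample.conf
```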

And finally the Helm install command, which is more or less what is currently in the README:

helm install \
	--namespace ceph-csi-cephfs \
	ceph-csi-cephfs \
	ceph-csi/ceph-csi-cephfs \
	--values values.yaml

After following all these steps, I am finally able to provision PVs, with the subvolume actually created in Ceph and the PVC marked as Bound.
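For reference, a PVC of roughly this shape is enough to exercise provisioning (the name mypvc and the StorageClass mysc match the examples above; the rest is a plain CephFS claim):

```yaml
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mypvc
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Gi
  storageClassName: mysc
```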

However, I am now stuck with a mount error on pods that try to consume them:

Type     Reason       Age                    From     Message
----     ------       ----                   ----     -------
Warning  FailedMount  18m (x655 over 25h)    kubelet  Unable to attach or mount volumes: unmounted volumes=[mypvc], unattached volumes=[], failed to process volumes=[]: timed out waiting for the condition
Warning  FailedMount  3m35s (x745 over 25h)  kubelet  MountVolume.MountDevice failed for volume "pvc-a3c70635-871e-485a-82b0-cc7d423bea41" : rpc error: code = Internal desc = failed to get stat for {{ .kubeletDir }}/plugins/{{ .driverName }}/net stat {{ .kubeletDir }}/plugins/{{ .driverName }}/net: no such file or directory stderr:

I don't know how to troubleshoot this. I see there are two requirements, but it is not clear how to ensure they are met, or at least how to check whether they are met:

  • "Your Kubernetes cluster must allow privileged pods (i.e. --allow-privileged flag must be set to true for both the API server and the kubelet)". I understand it is my responsibility as the cluster administrator to ensure it is the case, but it would be nice to have a way to check that it is correctly configured.
  • Same with "Moreover, as stated in the mount propagation docs, the Docker daemon of the cluster nodes must allow shared mounts." The linked docs do not explain how this can be configured.
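The best node-local checks I could come up with, heavily hedged: the paths below are kubeadm-style defaults, and the flag may not even exist on newer Kubernetes versions, so treat this as a sketch rather than a recipe:

```shell
# 1. Privileged pods: on kubeadm clusters the API server flags live in a
#    static pod manifest; no output just means the file or flag is absent.
grep -s -- '--allow-privileged' /etc/kubernetes/manifests/kube-apiserver.yaml || true

# 2. Shared mount propagation: on a node, the kubelet root (usually
#    /var/lib/kubelet) should report "shared". Demonstrated against /
#    so it runs anywhere:
findmnt -n -o PROPAGATION /
```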

About the section "Running CephCSI with pod networking" (https://github.com/ceph/ceph-csi/blob/devel/examples/README.md#running-cephcsi-with-pod-networking): I see this is a warning that things will not work with "pod networking", but it is not clear whether this is something only some clusters have, and I don't know how to check if it is what I have. It would be nice to have instructions on how to check for "pod networking" and how to avoid it.

gpothier avatar Mar 03 '24 00:03 gpothier

When I was deploying with plain manifests instead of the scripts, I also found that setting up a KMS configuration in a Kubernetes cluster is not well documented: there is only a reference to the KMS folder, which has no README.md. My deployment failed because of a missing KMS secret that is not referenced anywhere in the documentation.

jz543fm avatar Mar 04 '24 05:03 jz543fm

I was finally able to mount the provisioned volume in my app pod by removing this line from values.yaml:

netNamespaceFilePath: "{{ .kubeletDir }}/plugins/{{ .driverName }}/net"

The netNamespaceFilePath option is given in the example for csiConfig in the chart's values.yaml: https://github.com/ceph/ceph-csi/blob/5298762c4cbab1eec7b2e467908c98c5d1e38e11/charts/ceph-csi-cephfs/values.yaml#L30

This option enables the "pod networking" mode (https://github.com/ceph/ceph-csi/blob/devel/examples/README.md#running-cephcsi-with-pod-networking), which seems to be a somewhat unsupported mode of operation. I therefore suggest removing the netNamespaceFilePath option from the example in values.yaml, or at least clarifying what it does.

gpothier avatar Mar 10 '24 20:03 gpothier

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in a week if no further activity occurs. Thank you for your contributions.

github-actions[bot] avatar Apr 09 '24 21:04 github-actions[bot]

This issue has been automatically closed due to inactivity. Please re-open if this still requires investigation.

github-actions[bot] avatar Apr 17 '24 21:04 github-actions[bot]