synology-csi
Support for securityContext for pods
Installing the prometheus operator helm chart (https://prometheus-community.github.io/helm-charts, kube-prometheus-stack) with default values sets this securityContext for the Prometheus instance:
securityContext:
  runAsGroup: 2000
  runAsNonRoot: true
  runAsUser: 1000
  fsGroup: 2000
This makes the "prometheus-kube-prometheus-stack-prometheus-0" pod go into a crash loop with this error in the logs: "Unable to create mmap-ed active query log"
Changing the prometheusSpec securityContext like this makes it all work:
securityContext:
  runAsGroup: 0
  runAsNonRoot: true
  runAsUser: 0
  fsGroup: 2000
But then it is most likely running with root permissions on the file system.
This seems to be an issue with the CSI implementation: it doesn't implement fsGroup support or similar. For example, Longhorn does this with "fsGroupPolicy: ReadWriteOnceWithFSType", which causes each volume to be examined at mount time to determine whether permissions should be recursively applied.
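For reference, drivers expose this via the fsGroupPolicy field of their CSIDriver object. A minimal sketch of what that could look like for this driver (the object name matches the provisioner used below; the spec values are illustrative, not what the driver currently ships):

apiVersion: storage.k8s.io/v1
kind: CSIDriver
metadata:
  name: csi.san.synology.com
spec:
  attachRequired: true
  podInfoOnMount: true
  # examine each volume at mount time and apply fsGroup recursively
  # when the volume is ReadWriteOnce and has a defined fstype
  fsGroupPolicy: ReadWriteOnceWithFSType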
@lvikstro I get the following error even when I specify runAsUser 0. Are you sure it works? 🤔
securityContext:
  runAsGroup: 0
  runAsNonRoot: true
  runAsUser: 0
  fsGroup: 2000
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 3m52s default-scheduler Successfully assigned monitoring/prometheus-prometheus-kube-prometheus-prometheus-0 to node
Warning Failed 75s (x12 over 3m30s) kubelet Error: container's runAsUser breaks non-root policy (pod: "prometheus-prometheus-kube-prometheus-prometheus-0_monitoring(9e509cb5-e8af-4ac4-8ce0-c96fb6ca19c5)", container: init-config-reloader)
Normal Pulled 60s (x13 over 3m30s) kubelet Container image "quay.io/prometheus-operator/prometheus-config-reloader:v0.56.2" already present on machine
Hi there @lvikstro, I ran into the exact same problem (down to the exact same error message). I am using Kubernetes 1.21 with the newest release of the driver.
The security context does work, since applying it is the responsibility of Kubernetes, not the driver, but I had to configure a few things to get there:
- This apparently does not work with btrfs
- fsType in the storage class is deprecated and was replaced with csi.storage.k8s.io/fstype: ext4
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: normal
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: csi.san.synology.com
# if all params are empty, synology CSI will choose an available location to create volume
parameters:
  dsm: "<dsmip>"
  location: /volume<n>
  csi.storage.k8s.io/fstype: ext4
reclaimPolicy: Delete
allowVolumeExpansion: true
This combination fixed it for me
@schoeppi5
Hello.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: synology-iscsi-storage
  annotations:
    storageclass.kubernetes.io/is-default-class: "true"
provisioner: csi.san.synology.com
parameters:
  dsm: '192.168.16.240'
  location: '/volume1'
  csi.storage.k8s.io/fstype: ext4
reclaimPolicy: Retain
allowVolumeExpansion: true
This is my SC spec, but it still doesn't work for me .-. Any ideas?
I am also facing this issue. I am using the latest version of this repository, and PostgreSQL is not able to start because the owner of the NFS volume is not set properly (although I have set up the securityContext and StorageClass properly).
StatefulSet definition
apiVersion: apps/v1
kind: StatefulSet
metadata:
  creationTimestamp: "2022-06-17T14:55:44Z"
  generation: 18
  labels:
    app.kubernetes.io/component: primary
    app.kubernetes.io/instance: vaultarden
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: postgresql
    helm.sh/chart: postgresql-10.16.2
  name: vaultarden-postgresql
  namespace: vaultwarden
  resourceVersion: "7462911"
  uid: 239fd77b-e13b-4303-b007-431424ce526e
spec:
  podManagementPolicy: OrderedReady
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app.kubernetes.io/instance: vaultarden
      app.kubernetes.io/name: postgresql
      role: primary
  serviceName: vaultarden-postgresql-headless
  template:
    metadata:
      creationTimestamp: null
      labels:
        app.kubernetes.io/component: primary
        app.kubernetes.io/instance: vaultarden
        app.kubernetes.io/managed-by: Helm
        app.kubernetes.io/name: postgresql
        helm.sh/chart: postgresql-10.16.2
        role: primary
      name: vaultarden-postgresql
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - podAffinityTerm:
              labelSelector:
                matchLabels:
                  app.kubernetes.io/component: primary
                  app.kubernetes.io/instance: vaultarden
                  app.kubernetes.io/name: postgresql
              namespaces:
              - vaultwarden
              topologyKey: kubernetes.io/hostname
            weight: 1
      automountServiceAccountToken: false
      containers:
      - env:
        - name: BITNAMI_DEBUG
          value: "true"
        - name: POSTGRESQL_PORT_NUMBER
          value: "5432"
        - name: POSTGRESQL_VOLUME_DIR
          value: /bitnami/postgresql
        - name: PGDATA
          value: /bitnami/postgresql/data
        - name: POSTGRES_POSTGRES_PASSWORD
          valueFrom:
            secretKeyRef:
              key: postgresql-postgres-password
              name: vaultarden-postgresql
        - name: POSTGRES_USER
          value: vaultwarden
        - name: POSTGRES_PASSWORD
          valueFrom:
            secretKeyRef:
              key: postgresql-password
              name: vaultarden-postgresql
        - name: POSTGRES_DB
          value: vaultwarden
        - name: POSTGRESQL_ENABLE_LDAP
          value: "no"
        - name: POSTGRESQL_ENABLE_TLS
          value: "no"
        - name: POSTGRESQL_LOG_HOSTNAME
          value: "false"
        - name: POSTGRESQL_LOG_CONNECTIONS
          value: "false"
        - name: POSTGRESQL_LOG_DISCONNECTIONS
          value: "false"
        - name: POSTGRESQL_PGAUDIT_LOG_CATALOG
          value: "off"
        - name: POSTGRESQL_CLIENT_MIN_MESSAGES
          value: error
        - name: POSTGRESQL_SHARED_PRELOAD_LIBRARIES
          value: pgaudit
        image: docker.io/bitnami/postgresql:11.14.0-debian-10-r28
        imagePullPolicy: IfNotPresent
        livenessProbe:
          exec:
            command:
            - /bin/sh
            - -c
            - exec pg_isready -U "vaultwarden" -d "dbname=vaultwarden" -h 127.0.0.1 -p 5432
          failureThreshold: 6
          initialDelaySeconds: 30
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 5
        name: vaultarden-postgresql
        ports:
        - containerPort: 5432
          name: tcp-postgresql
          protocol: TCP
        readinessProbe:
          exec:
            command:
            - /bin/sh
            - -c
            - -e
            - |
              exec pg_isready -U "vaultwarden" -d "dbname=vaultwarden" -h 127.0.0.1 -p 5432
              [ -f /opt/bitnami/postgresql/tmp/.initialized ] || [ -f /bitnami/postgresql/.initialized ]
          failureThreshold: 6
          initialDelaySeconds: 5
          periodSeconds: 10
          successThreshold: 1
          timeoutSeconds: 5
        resources:
          requests:
            cpu: 250m
            memory: 256Mi
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /dev/shm
          name: dshm
        - mountPath: /bitnami/postgresql
          name: postgresql
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext:
        fsGroup: 1001
      terminationGracePeriodSeconds: 30
      volumes:
      - name: postgresql
        persistentVolumeClaim:
          claimName: test
      - emptyDir:
          medium: Memory
        name: dshm
      - emptyDir: {}
        name: data
  updateStrategy:
    type: RollingUpdate
StorageClass definition
allowVolumeExpansion: true
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  creationTimestamp: "2022-06-21T07:53:58Z"
  name: synology-smb-storage
  resourceVersion: "7438914"
  uid: 19290577-5044-427d-a4a6-5532b83c49bb
parameters:
  csi.storage.k8s.io/node-stage-secret-name: cifs-csi-credentials
  csi.storage.k8s.io/node-stage-secret-namespace: synology-csi
  dsm: 192.168.30.13
  fsType: ext4
  location: /volume1
  protocol: smb
provisioner: csi.san.synology.com
reclaimPolicy: Retain
volumeBindingMode: Immediate
These are the logs from PostgreSQL when using this configuration:
postgresql 11:04:36.12
postgresql 11:04:36.12 Welcome to the Bitnami postgresql container
postgresql 11:04:36.12 Subscribe to project updates by watching https://github.com/bitnami/bitnami-docker-postgresql
postgresql 11:04:36.12 Submit issues and feature requests at https://github.com/bitnami/bitnami-docker-postgresql/issues
postgresql 11:04:36.13
postgresql 11:04:36.13 DEBUG ==> Configuring libnss_wrapper...
postgresql 11:04:36.14 INFO ==> ** Starting PostgreSQL setup **
postgresql 11:04:36.18 INFO ==> Validating settings in POSTGRESQL_* env vars..
postgresql 11:04:36.18 INFO ==> Loading custom pre-init scripts...
postgresql 11:04:36.19 INFO ==> Initializing PostgreSQL database...
postgresql 11:04:36.19 DEBUG ==> Ensuring expected directories/files exist...
mkdir: cannot create directory ‘/bitnami/postgresql/data’: Permission denied
Anyway, if I change the volume to an emptyDir (instead of a PVC from Synology NFS), it works, and I can verify that the owner is 1001 (which I have set in the securityContext):
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.
The database cluster will be initialized with locale "en_US.UTF-8".
The default text search configuration will be set to "english".
Data page checksums are disabled.
fixing permissions on existing directory /bitnami/postgresql/data ... ok
creating subdirectories ... ok
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting default timezone ... Etc/UTC
selecting dynamic shared memory implementation ... posix
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... ok
WARNING: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.
syncing data to disk ... ok
Success. You can now start the database server using:
/opt/bitnami/postgresql/bin/pg_ctl -D /bitnami/postgresql/data -l logfile start
postgresql 11:09:21.51 INFO ==> Starting PostgreSQL in background...
waiting for server to start....2022-06-21 11:09:21.540 GMT [66] LOG: listening on IPv6 address "::1", port 5432
2022-06-21 11:09:21.540 GMT [66] LOG: listening on IPv4 address "127.0.0.1", port 5432
2022-06-21 11:09:21.543 GMT [66] LOG: listening on Unix socket "/tmp/.s.PGSQL.5432"
2022-06-21 11:09:21.555 GMT [67] LOG: database system was shut down at 2022-06-21 11:09:21 GMT
2022-06-21 11:09:21.558 GMT [66] LOG: database system is ready to accept connections
done
server started
CREATE DATABASE
postgresql 11:09:21.97 INFO ==> Changing password of postgres
ALTER ROLE
postgresql 11:09:22.01 INFO ==> Creating user vaultwarden
CREATE ROLE
postgresql 11:09:22.03 INFO ==> Granting access to "vaultwarden" to the database "vaultwarden"
GRANT
ALTER DATABASE
postgresql 11:09:22.06 INFO ==> Setting ownership for the 'public' schema database "vaultwarden" to "vaultwarden"
ALTER SCHEMA
postgresql 11:09:22.10 INFO ==> Configuring replication parameters
postgresql 11:09:22.14 INFO ==> Configuring synchronous_replication
postgresql 11:09:22.14 INFO ==> Configuring fsync
postgresql 11:09:22.18 INFO ==> Loading custom scripts...
postgresql 11:09:22.19 INFO ==> Enabling remote connections
postgresql 11:09:22.20 INFO ==> Stopping PostgreSQL...
waiting for server to shut down....2022-06-21 11:09:22.214 GMT [66] LOG: received fast shutdown request
2022-06-21 11:09:22.216 GMT [66] LOG: aborting any active transactions
2022-06-21 11:09:22.220 GMT [66] LOG: background worker "logical replication launcher" (PID 73) exited with exit code 1
2022-06-21 11:09:22.221 GMT [68] LOG: shutting down
2022-06-21 11:09:22.239 GMT [66] LOG: database system is shut down
done
server stopped
postgresql 11:09:22.32 INFO ==> ** PostgreSQL setup finished! **
postgresql 11:09:22.37 INFO ==> ** Starting PostgreSQL **
2022-06-21 11:09:22.393 GMT [1] LOG: pgaudit extension initialized
2022-06-21 11:09:22.395 GMT [1] LOG: listening on IPv4 address "0.0.0.0", port 5432
2022-06-21 11:09:22.395 GMT [1] LOG: listening on IPv6 address "::", port 5432
2022-06-21 11:09:22.398 GMT [1] LOG: listening on Unix socket "/tmp/.s.PGSQL.5432"
2022-06-21 11:09:22.412 GMT [156] LOG: database system was shut down at 2022-06-21 11:09:22 GMT
2022-06-21 11:09:22.415 GMT [1] LOG: database system is ready to accept connections
Owner of the directory
$ ls -l /bitnami/
total 8
drwxrwsrwx. 3 root 1001 4096 Jun 21 11:09 postgresql
$ ls -l /bitnami/postgresql
total 8
drwx------. 19 1001 1001 4096 Jun 21 11:09 data
$ ls -l /bitnami/postgresql/data
total 176
drwx------. 6 1001 root 4096 Jun 21 11:09 base
drwx------. 2 1001 root 4096 Jun 21 11:10 global
drwx------. 2 1001 root 4096 Jun 21 11:09 pg_commit_ts
drwx------. 2 1001 root 4096 Jun 21 11:09 pg_dynshmem
-rw-------. 1 1001 root 1636 Jun 21 11:09 pg_ident.conf
drwx------. 4 1001 root 4096 Jun 21 11:09 pg_logical
drwx------. 4 1001 root 4096 Jun 21 11:09 pg_multixact
drwx------. 2 1001 root 4096 Jun 21 11:09 pg_notify
drwx------. 2 1001 root 4096 Jun 21 11:09 pg_replslot
drwx------. 2 1001 root 4096 Jun 21 11:09 pg_serial
drwx------. 2 1001 root 4096 Jun 21 11:09 pg_snapshots
drwx------. 2 1001 root 4096 Jun 21 11:09 pg_stat
drwx------. 2 1001 root 4096 Jun 21 11:10 pg_stat_tmp
drwx------. 2 1001 root 4096 Jun 21 11:09 pg_subtrans
drwx------. 2 1001 root 4096 Jun 21 11:09 pg_tblspc
drwx------. 2 1001 root 4096 Jun 21 11:09 pg_twophase
-rw-------. 1 1001 root 3 Jun 21 11:09 PG_VERSION
drwx------. 3 1001 root 4096 Jun 21 11:09 pg_wal
drwx------. 2 1001 root 4096 Jun 21 11:09 pg_xact
-rw-------. 1 1001 root 88 Jun 21 11:09 postgresql.auto.conf
-rw-------. 1 1001 root 249 Jun 21 11:09 postmaster.opts
-rw-------. 1 1001 root 79 Jun 21 11:09 postmaster.pid
Depending on the K8s version you are using, there is a problem with the DelegateFSGroupToCSIDriver feature gate, which is enabled by default starting with K8s 1.23.
Normally, the kubelet is responsible for "fulfilling" the securityContext chown and chmod requirements. This feature gate enables the kubelet to delegate this to the CSI driver, if the driver supports it.
The Synology CSI driver declares that it is able to do that, but just isn't doing it.
The quick workaround is to disable this feature gate and always let the kubelet do the work.
The proper solution would be for the CSI driver either to not declare this capability, or to actually implement it.
I am also running into this issue with OpenShift 4.11 (based on k8s 1.24). CSI driver provisions and mounts the volume with no issues, but pod instantiation always fails with Permission Denied.
nfs
The kubelet isn't doing the chown for NFS volumes. You'll have to use an init container for that; a sketch follows below.
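For anyone who needs it, a minimal sketch of such an init container, reusing the volume name, mount path, and UID/GID from the PostgreSQL example above (adjust all of these for your workload):

initContainers:
- name: fix-volume-permissions
  image: busybox:1.36
  # hand the volume over to the UID/GID the main container runs as
  command: ["sh", "-c", "chown -R 1001:1001 /bitnami/postgresql"]
  securityContext:
    runAsUser: 0  # must run as root to be able to chown
  volumeMounts:
  - name: postgresql
    mountPath: /bitnami/postgresql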
@rblundon Do you have a bit more info? Maybe the storage class you're using and the pod spec?
Disabling DelegateFSGroupToCSIDriver worked perfectly for me btw. Thank you so much!
How do I disable DelegateFSGroupToCSIDriver on an existing cluster?
https://kubernetes.io/docs/reference/command-line-tools-reference/feature-gates/
@Ryanznoco Hi, it depends on which Kubernetes distribution you use, but you need to look up "feature gates":
https://kubernetes.io/docs/reference/command-line-tools-reference/feature-gates/
It is not working for me. I added "--feature-gates=DelegateFSGroupToCSIDriver=false" to the api-server manifest, but after redeploying completely, my redis pod still gets a "Permission denied" error. My Kubernetes version is 1.23.10.
You need to add the feature gate to every kubelet config in your cluster, since this is a kubelet gate, not an API server one. Depending on your installation method, there might be an easier way of changing the kubelet config.
Sorry, I don't know how to do this. Can I delete the line csi.NodeServiceCapability_RPC_VOLUME_MOUNT_GROUP, in the source code, then recompile and install the CSI driver?
That would be, like, a lot more work than disabling the feature gate.
Please read the whole message before proceeding.
There are a few ways of doing that; the manual approach should work most of the time, while the other approaches depend on your installation.
Manually
If you have K8s installed locally, on VMs, or any other way where your installation isn't managed by a cloud provider, you can:
1. Go to one of your nodes
2. Find the kubelet process (pid): ps -ef | grep /usr/bin/kubelet
3. Find the command line: cat /proc/<pid>/cmdline
4. Find the path of the kubelet config, which is the path after --config= (that is probably going to be /var/lib/kubelet/config.yaml)
5. In that file, you can add:
   featureGates:
     DelegateFSGroupToCSIDriver: false
6. Restart the kubelet service (systemctl restart kubelet)
7. Congrats, you just disabled a feature gate in the kubelet. Now repeat steps 5 & 6 on all other worker nodes.
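For reference, a sketch of how the finished file fits together (the apiVersion/kind header should already be present in your existing config; only the featureGates entry is new):

# /var/lib/kubelet/config.yaml (excerpt)
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
featureGates:
  DelegateFSGroupToCSIDriver: false
# ...the rest of the existing kubelet config stays unchanged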
kubeadm
kubeadm keeps the current kubelet config in a ConfigMap on the cluster, in the kube-system namespace, called kubelet-config.
You can find the whole process well documented here. It works pretty much the same as the manual approach.
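Roughly, the edit lands in that ConfigMap like this (a sketch; the data key and surrounding fields can vary with your kubeadm version):

apiVersion: v1
kind: ConfigMap
metadata:
  name: kubelet-config
  namespace: kube-system
data:
  kubelet: |
    apiVersion: kubelet.config.k8s.io/v1beta1
    kind: KubeletConfiguration
    featureGates:
      DelegateFSGroupToCSIDriver: false
    # ...rest of the existing kubelet config

After editing, the change still has to be rolled out to each node (e.g. with kubeadm upgrade node phase kubelet-config, followed by a kubelet restart), as the linked docs describe.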
Talos Linux
Just putting this one in here, since I am working with it right now:
In your worker configuration, add the feature gates you want to enable or disable, as depicted in their docs here.
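A sketch of what that could look like in the worker machine config, assuming the extraArgs route from their docs:

machine:
  kubelet:
    extraArgs:
      # passed straight through to the kubelet command line
      feature-gates: DelegateFSGroupToCSIDriver=false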
Others
For other installation methods, you'll have to consult their respective docs on how to disable kubelet feature gates.
But yes, you could probably also remove this capability, recompile, rebuild the image, and patch your deployment, but no guarantees on that.
At this point, I might as well open a PR for this issue. We'll see.
Hope this helps you and you get the driver running correctly. Let me know how it went, I'm invested now 😆
This makes sense, but how would it be accomplished on OpenShift?
Sorry, I don't have any experience in OpenShift
@rblundon for OpenShift problems you really SHOULD ask Red Hat
@schoeppi5 Thank you for your help. I solved it by modifying the source code. I also tried modifying the kubeadm configmap, but it still doesn't work. Looking forward to the new release with your PR. @inductor And thank you too.
@Ryanznoco - I will try reinstalling the CSI driver from your repo. @inductor - I work for RH, but in sales, not engineering. Pretty sure this wouldn't get any traction, as it looks to be an issue with the Synology driver not doing what it advertises it is capable of doing.
The problem here is that this CSI driver just declares the capability but does not actually implement it, unlike other CSI drivers. See an example here: https://github.com/kubernetes-csi/csi-driver-smb/pull/379/files
We should point out this limitation in the README, because more and more k8s clusters and distros implement Security Context nowadays, and the days when everything ran as root are likely (and hopefully) gone.
DelegateFSGroupToCSIDriver is enabled by default as of 1.26, since the feature is considered GA.
I also just upgraded to v1.26.1, and the kubelet wouldn't allow me to disable the feature gate anymore. This is a pretty major bug, which means the CSI driver doesn't work on k8s v1.26 or later.
It's a pretty easy fix. @chihyuwu would you mind cutting a new release please?
hi @vaskozl, Of course! Thanks for letting us know about it! A new version without the RPC_VOLUME_MOUNT_GROUP capability will be released soon to make sure that the plugin is compatible with k8s v1.26. We'll definitely put more emphasis on CSI's flexibility and compatibility in the future as well!
A new update of synology/synology-csi:latest is available now.
The problem here is that this CSI driver just declares the capability but does not actually implement it, unlike other CSI drivers. See an example here: https://github.com/kubernetes-csi/csi-driver-smb/pull/379/files
In the previous version, we tried to implement securityContext support like in that example, but still missed something. We'll check and fix it in the future.
@chihyuwu Would you be willing to move the image to GitHub instead of Docker Hub? Rate limits can be problematic in some environments sharing the same outgoing global IP address(es).
When will it be released?
Thanks @schoeppi5 for the detailed description! Adding the K3s instructions for those that might need them until it is resolved:
- Add these lines to /etc/rancher/k3s/config.yaml:
  kubelet-arg:
  - feature-gates=DelegateFSGroupToCSIDriver=false
- Restart K3s: systemctl restart k3s

Works for me on K3s v1.25.3.
Works on <=1.25 only; on 1.26 this feature gate can no longer be disabled.