Expanding PVC Volume Template Results in Data Loss
When we try to expand the PVC volume template, the operator deletes and re-creates the PVC volumes instead of just resizing them. We are using Rook-Ceph as the storage provider and have successfully resized PVCs there without a delete/re-create. We can also manually edit the PVC itself and it will expand. We are using version 0.22.2 of the operator, and I've reproduced this in multiple clusters.
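For reference, this is roughly how we resize a PVC by hand and it expands in place (a minimal sketch; the PVC name and namespace match one of the affected PVCs, and the target size is just an example):

# patch the PVC's storage request directly; Rook-Ceph expands the volume online
kubectl -n clark-developer-featbit patch pvc default-chi-clickhouse-replicated-0-0-0 \
  --type merge -p '{"spec":{"resources":{"requests":{"storage":"60Gi"}}}}'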
We have also tried it without the storageManagement options, and that just results in a loop where the operator continually tries to delete and re-create the PVCs:
storageManagement:
provisioner: Operator
reclaimPolicy: Retain
---
apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
name: "clickhouse"
spec:
defaults:
templates:
dataVolumeClaimTemplate: default
podTemplate: clickhouse:23.7.1.2470-alpine
storageManagement:
provisioner: Operator
reclaimPolicy: Retain
configuration:
settings:
# to allow scrape metrics via embedded prometheus protocol
prometheus/endpoint: /metrics
prometheus/port: 8888
prometheus/metrics: true
prometheus/events: true
prometheus/asynchronous_metrics: true
zookeeper:
nodes:
- host: clickhouse-keeper.clickhouse.svc.cluster.local
users:
default/networks/ip: "::/0"
default/password: password
default/profile: default
# use cluster Pod CIDR for more security
backup/networks/ip: 0.0.0.0/0
# PASSWORD=backup_password; echo "$PASSWORD"; echo -n "$PASSWORD" | sha256sum | tr -d '-'
backup/password_sha256_hex: eb94c11d77f46a0290ba8c4fca1a7fd315b72e1e6c83146e42117c568cc3ea4d
clusters:
- name: replicated
layout:
shardsCount: 1
replicasCount: 3
files:
config.xml: |
<?xml version="1.0"?>
<yandex>
<remote_servers>
<!-- Test only shard config for testing distributed storage -->
<ch_cluster>
<shard>
<internal_replication>True</internal_replication>
<replica>
<host>chi-clickhouse-replicated-0-0</host>
<port>9000</port>
<secure>0</secure>
</replica>
<replica>
<host>chi-clickhouse-replicated-0-1</host>
<port>9000</port>
<secure>0</secure>
</replica>
<replica>
<host>chi-clickhouse-replicated-0-2</host>
<port>9000</port>
<secure>0</secure>
</replica>
</shard>
</ch_cluster>
</remote_servers>
<!-- If element has 'incl' attribute, then for it's value will be used corresponding substitution from another file.
By default, path to file with substitutions is /etc/metrika.xml. It could be changed in config in 'include_from' element.
Values for substitutions are specified in /clickhouse/name_of_substitution elements in that file.
-->
<!-- ZooKeeper is used to store metadata about replicas, when using Replicated tables.
Optional. If you don't use replicated tables, you could omit that.
See https://clickhouse.com/docs/en/engines/table-engines/mergetree-family/replication/
-->
<zookeeper>
<node>
<host>clickhouse-keeper.clickhouse.svc.cluster.local</host>
<port>2181</port>
<secure>0</secure>
</node>
</zookeeper>
<!--
OpenTelemetry log contains OpenTelemetry trace spans.
-->
<opentelemetry_span_log>
<!--
The default table creation code is insufficient, this <engine> spec
is a workaround. There is no 'event_time' for this log, but two times,
start and finish. It is sorted by finish time, to avoid inserting
data too far away in the past (probably we can sometimes insert a span
that is seconds earlier than the last span in the table, due to a race
between several spans inserted in parallel). This gives the spans a
global order that we can use to e.g. retry insertion into some external
system.
-->
<engine>
engine MergeTree
partition by toYYYYMM(finish_date)
order by (finish_date, finish_time_us, trace_id)
</engine>
<database>system</database>
<table>opentelemetry_span_log</table>
<flush_interval_milliseconds>7500</flush_interval_milliseconds>
</opentelemetry_span_log>
</yandex>
templates:
volumeClaimTemplates:
- name: default
spec:
accessModes:
- ReadWriteOnce
reclaimPolicy: Retain
resources:
requests:
storage: 55Gi
podTemplates:
- name: clickhouse:23.7.1.2470-alpine
metadata:
annotations:
prometheus.io/scrape: 'true'
prometheus.io/port: '8888'
prometheus.io/path: '/metrics'
# need separate prometheus scrape config, look to https://github.com/prometheus/prometheus/issues/3756
clickhouse.backup/scrape: 'true'
clickhouse.backup/port: '7171'
clickhouse.backup/path: '/metrics'
spec:
containers:
- name: clickhouse-pod
image: clickhouse-server:23.7.1.2470-alpine
- name: clickhouse-backup
image: clickhouse-backup:latest
imagePullPolicy: Always
command:
- bash
- -xc
- "/bin/clickhouse-backup server"
env:
- name: CLICKHOUSE_PASSWORD
value: password
- name: LOG_LEVEL
value: "debug"
- name: ALLOW_EMPTY_BACKUPS
value: "true"
- name: API_LISTEN
value: "0.0.0.0:7171"
# INSERT INTO system.backup_actions to execute backup
- name: API_CREATE_INTEGRATION_TABLES
value: "true"
- name: BACKUPS_TO_KEEP_REMOTE
value: "3"
# change it for production S3
- name: REMOTE_STORAGE
value: "s3"
- name: S3_ACL
value: "private"
- name: S3_ENDPOINT
value: https://minio
- name: S3_BUCKET
value: clickhouse-backups
# {shard} macro defined by clickhouse-operator
- name: S3_PATH
value: backup/shard-{shard}
- name: S3_ACCESS_KEY
value: clickhouse_backups_rw
- name: S3_DISABLE_CERT_VERIFICATION
value: "true"
- name: S3_SECRET_KEY
value: password
- name: S3_FORCE_PATH_STYLE
value: "true"
ports:
- name: backup-rest
containerPort: 7171
Thanks. Would it be possible to attach the operator log as a file to this case? I would like to see if there is an issue with operator reconciliation. If you can access the Rook logs, please attach those as well.
@tman5 , could you show your storage classes?
kubectl get storageclasses -o wide
And it would be useful to see one of the PVCs created by the operator.
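Something like this should dump them all (the label selector matches what the operator puts on its PVCs; replace the namespace with yours):

kubectl get pvc -n <namespace> -l clickhouse.altinity.com/chi=clickhouse -o yaml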
NAME                          PROVISIONER                     RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
ceph-bucket                   rook-ceph.ceph.rook.io/bucket   Delete          Immediate           false                  208d
ceph-filesystem               rook-ceph.cephfs.csi.ceph.com   Delete          Immediate           true                   208d
rook-ceph-block (default)     rook-ceph.rbd.csi.ceph.com      Delete          Immediate           true                   208d
sc-smb-mssql-database-repos   smb.csi.k8s.io                  Retain          Immediate           false                  182d
sc-smb-mssql-deploy-scripts   smb.csi.k8s.io                  Retain          Immediate           false                  182d
sc-smb-mssql-wss              smb.csi.k8s.io                  Retain          Immediate           false                  182d
This is one of the PVCs that will perpetually be in a terminating state:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
annotations:
pv.kubernetes.io/bind-completed: "yes"
pv.kubernetes.io/bound-by-controller: "yes"
volume.beta.kubernetes.io/storage-provisioner: rook-ceph.rbd.csi.ceph.com
volume.kubernetes.io/storage-provisioner: rook-ceph.rbd.csi.ceph.com
creationTimestamp: "2024-04-02T12:05:28Z"
deletionGracePeriodSeconds: 0
deletionTimestamp: "2024-04-02T12:05:34Z"
finalizers:
- kubernetes.io/pvc-protection
labels:
argocd.argoproj.io/instance: featbit-clickhouse-dev2
clickhouse.altinity.com/app: chop
clickhouse.altinity.com/chi: clickhouse
clickhouse.altinity.com/cluster: replicated
clickhouse.altinity.com/namespace: clark-developer-featbit
clickhouse.altinity.com/object-version: 241ccf05924775f258c440aecb86eecc549bb3ce
clickhouse.altinity.com/reclaimPolicy: Retain
clickhouse.altinity.com/replica: "0"
clickhouse.altinity.com/shard: "0"
name: default-chi-clickhouse-replicated-0-0-0
namespace: clark-developer-featbit
resourceVersion: "298826497"
uid: f9ea50da-82a6-47b9-9231-8a53022d5d03
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 60Gi
storageClassName: rook-ceph-block
volumeMode: Filesystem
volumeName: pvc-f9ea50da-82a6-47b9-9231-8a53022d5d03
status:
accessModes:
- ReadWriteOnce
capacity:
storage: 60Gi
phase: Bound
E0402 12:08:00.875175 1 creator.go:175] updatePersistentVolumeClaim():clark-developer-featbit/default-chi-clickhouse-replicated-0-1-0:unable to Update PVC err: Operation cannot be fulfilled on persistentvolumeclaims "default-chi-clickhouse-replicated-0-1-0": the object has been modified; please apply your changes to the latest version and try again
E0402 12:08:00.875219 1 worker-chi-reconciler.go:1000] reconcilePVCFromVolumeMount():ERROR unable to reconcile PVC(clark-developer-featbit/default-chi-clickhouse-replicated-0-1-0) err: Operation cannot be fulfilled on persistentvolumeclaims "default-chi-clickhouse-replicated-0-1-0": the object has been modified; please apply your changes to the latest version and try again
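In case it is useful, this is how we check what is still holding the stuck PVC (kubectl describe lists the mounting pods under "Used By" and shows the related events):

kubectl -n clark-developer-featbit describe pvc default-chi-clickhouse-replicated-0-0-0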
It means something else, such as ArgoCD, changed the PVC.
Could you try to deploy the CHI without ArgoCD and then try to rescale?
Is there a way to make it work with Argo?
Errors like that cannot lead to PVC deletion on their own. I wonder if it was actually ArgoCD that deleted it?
@tman5 Assuming you are using Argo CD, can you describe how you have configured CI/CD and exactly what steps you follow to change the volume size? It seems possible that multiple actors are trying to manage the CHI resources, or at least the underlying volumes.
P.S. Argo CD is normally fine with changes to storage size; I've done it many times on AWS EBS volumes.
This is my Argo CD config:
---
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
name: clickhouse
namespace: argo-cd
spec:
destination:
namespace: clickhouse
server: https://kube-server
project: dev
source:
path: ./overlays/dev1/clickhouse
repoURL: https://repo.local
targetRevision: master
syncPolicy:
automated:
prune: true
selfHeal: true
retry:
backoff:
duration: 5s
factor: 2
maxDuration: 3m0s
limit: 2
syncOptions:
- CreateNamespace=true
- PruneLast=true
- PrunePropagationPolicy=foreground
- ServerSideApply=true
- --sync-hook-timeout=60s
- --sync-wait=60s
It points to a repo that has a kustomize file:
---
kind: Kustomization
apiVersion: kustomize.config.k8s.io/v1beta1
resources:
- ../../../base/clickhouse-keeper/
- ../clickhouse-operator/
- manifest.yml
- clickhouse-backup-rw-password.yml
namespace: clickhouse
...
Then the manifest file is what I posted above. I edit the PVC size in that manifest, commit it to the repo, and then let Argo do its thing.
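The only change I make when resizing is the storage request on the volumeClaimTemplate, e.g. bumping it from 55Gi to 60Gi:

templates:
  volumeClaimTemplates:
    - name: default
      spec:
        resources:
          requests:
            storage: 60Gi   # previously 55Gi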
In the clickhouse-operator directory, this is the kustomize file:
---
kind: Kustomization
apiVersion: kustomize.config.k8s.io/v1beta1
helmCharts:
- name: altinity-clickhouse-operator
releaseName: clickhouse-operator
namespace: clickhouse
repo: https://docs.altinity.com/clickhouse-operator/
version: 0.22.2
valuesInline:
configs:
configdFiles:
01-clickhouse-02-logger.xml: |
<!-- IMPORTANT -->
<!-- This file is auto-generated -->
<!-- Do not edit this file - all changes would be lost -->
<!-- Edit appropriate template in the following folder: -->
<!-- deploy/builder/templates-config -->
<!-- IMPORTANT -->
<yandex>
<logger>
<!-- Possible levels: https://github.com/pocoproject/poco/blob/develop/Foundation/include/Poco/Logger.h#L105 -->
<level>warning</level>
<log>/var/log/clickhouse-server/clickhouse-server.log</log>
<errorlog>/var/log/clickhouse-server/clickhouse-server.err.log</errorlog>
<size>1000M</size>
<count>10</count>
<!-- Default behavior is autodetection (log to console if not daemon mode and is tty) -->
<console>1</console>
</logger>
</yandex>
...
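For what it's worth, the overlay can be rendered locally to see exactly what Argo applies (a sketch; assumes a kustomize version with Helm chart inflation enabled):

kustomize build --enable-helm ./overlays/dev1/clickhouse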
@tman5, it is possible that there is a conflict between ArgoCD and the operator. Try altering the operator configuration to exclude these labels from dependent objects, including PVCs:
apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseOperatorConfiguration"
metadata:
name: "exclude-argocd-label"
spec:
label:
exclude:
- argocd.argoproj.io/instance
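Apply it into the namespace the operator runs in (the clickhouse namespace per your Helm values); the operator should pick up ClickHouseOperatorConfiguration objects in its namespace and merge them into its runtime configuration. A rough sketch (the file name is just an example):

kubectl -n clickhouse apply -f chop-config-exclude-argocd-label.yaml

An alternative on the Argo CD side (assuming Argo CD v2.2+ and that nothing else depends on the instance label) is to switch resource tracking from the label to an annotation, so Argo CD stops stamping its instance label onto operator-managed objects:

apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-cm
  namespace: argo-cd
data:
  resource.trackingMethod: annotation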
I can confirm it is a clash between Argo and the operator. It happened to me: using kubectl everything was fine, even when I destroyed the entire nodes and only the PVCs were left.
But once everything was finished and I put my YAMLs into Argo, Argo started syncing and adding the app.kubernetes.io/instance label.
This is what happened after I put my YAMLs into Argo:
Info ReconcileStarted 45m clickhouse-operator reconcile started, task id: 44562628-2111-4328-a3c9-be22829c8eb9
Info UpdateCompleted 45m clickhouse-operator Update ConfigMap ch-data-warehouse/chi-jb-data-warehouse-common-configd
Info UpdateCompleted 45m clickhouse-operator Update ConfigMap ch-data-warehouse/chi-jb-data-warehouse-common-usersd
Info UpdateCompleted 45m clickhouse-operator Update Service success: ch-data-warehouse/service-dwp
Info UpdateCompleted 45m clickhouse-operator Update ConfigMap ch-data-warehouse/chi-jb-data-warehouse-common-configd
Info UpdateCompleted 45m clickhouse-operator Update ConfigMap ch-data-warehouse/chi-jb-data-warehouse-common-usersd
Info UpdateCompleted 45m clickhouse-operator Update ConfigMap ch-data-warehouse/chi-jb-data-warehouse-common-usersd
Info UpdateCompleted 44m clickhouse-operator Update ConfigMap ch-data-warehouse/chi-jb-data-warehouse-deploy-confd-dwp-0-0
Info CreateStarted 44m clickhouse-operator Update StatefulSet(ch-data-warehouse/chi-jb-data-warehouse-dwp-0-0) - started
Info UpdateInProgress 44m clickhouse-operator Update StatefulSet(ch-data-warehouse/chi-jb-data-warehouse-dwp-0-0) switch from Update to Recreate
Info CreateStarted 43m clickhouse-operator Create StatefulSet: ch-data-warehouse/chi-jb-data-warehouse-dwp-0-0 - started
Info UpdateCompleted 38m clickhouse-operator Update ConfigMap ch-data-warehouse/chi-jb-data-warehouse-common-usersd
Info CreateCompleted 38m clickhouse-operator Create StatefulSet: ch-data-warehouse/chi-jb-data-warehouse-dwp-0-0 - completed
Info UpdateCompleted 38m clickhouse-operator Update Service success: ch-data-warehouse/service-jb-data-warehouse-0-0
Info ProgressHostsCompleted 38m clickhouse-operator [now: 2025-02-12 04:59:36.643838886 +0000 UTC m=+86464.463691633] ProgressHostsCompleted: 1 of 4
Info UpdateCompleted 38m clickhouse-operator Update ConfigMap ch-data-warehouse/chi-jb-data-warehouse-common-configd
Info ReconcileCompleted 38m clickhouse-operator Reconcile Host completed. Host: 0-0 ClickHouse version running: 24.8.13.16
Info UpdateCompleted 38m clickhouse-operator Update Service success: ch-data-warehouse/service-jb-data-warehouse
Info UpdateCompleted 38m clickhouse-operator Update ConfigMap ch-data-warehouse/chi-jb-data-warehouse-common-usersd
Info UpdateCompleted 38m clickhouse-operator Update ConfigMap ch-data-warehouse/chi-jb-data-warehouse-common-configd
Info UpdateCompleted 38m clickhouse-operator Update ConfigMap ch-data-warehouse/chi-jb-data-warehouse-common-usersd
Info UpdateCompleted 37m clickhouse-operator Update ConfigMap ch-data-warehouse/chi-jb-data-warehouse-deploy-confd-dwp-0-1
Info CreateStarted 37m clickhouse-operator Update StatefulSet(ch-data-warehouse/chi-jb-data-warehouse-dwp-0-1) - started
Info UpdateInProgress 36m clickhouse-operator Update StatefulSet(ch-data-warehouse/chi-jb-data-warehouse-dwp-0-1) switch from Update to Recreate
Info CreateStarted 36m clickhouse-operator Create StatefulSet: ch-data-warehouse/chi-jb-data-warehouse-dwp-0-1 - started