kubegres
imagePullSecrets not used in backup cronjob templates
If imagePullSecrets are used to pull the main database image, they are not propagated to the backup CronJob template. So if the backup job is scheduled on a node that does not already have the database image, it fails to pull the image.
Manually adding imagePullSecrets to the CronJob definition (sketched below) works, but this is a dirty workaround :)
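For reference, the manual workaround looks roughly like this (only the relevant fragment of the CronJob is shown; field paths follow the Kubernetes batch/v1beta1 CronJob API, and the secret and container names match the example shared later in this thread):

spec:
  jobTemplate:
    spec:
      template:
        spec:
          # Added by hand: the same pull secret that the Kubegres spec
          # uses for the main database image.
          imagePullSecrets:
            - name: registry-secret
          containers:
            - name: backup-postgres
              image: my-private-registry/image:tag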
Thank you for reporting this issue, which occurs when imagePullSecrets are used to pull the main database image.
Could you please share a YAML example where your configuration does not work?
Thank you for your quick reply. Here is some more info:
Launching this YAML file on a cluster (at scaleway.com => scw-xxx) with 1 node works fine:
apiVersion: v1
data:
  .dockerconfigjson: REDACTED
kind: Secret
metadata:
  name: registry-secret
  namespace: kubegres-sandbox
type: kubernetes.io/dockerconfigjson
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: kubegres-backup-issue-78-pvc
  namespace: kubegres-sandbox
spec:
  storageClassName: scw-bssd
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
---
apiVersion: kubegres.reactive-tech.io/v1
kind: Kubegres
metadata:
  name: kubegres-issue-78
  namespace: kubegres-sandbox
spec:
  replicas: 1
  image: my-private-registry/image:tag
  imagePullSecrets:
    - name: registry-secret
  database:
    size: 1Gi
    storageClassName: scw-bssd
  failover:
    isDisabled: true
  backup:
    schedule: "*/3 * * * *"
    pvcName: kubegres-backup-issue-78-pvc
    volumeMount: /var/lib/backup
  env:
    - name: POSTGRES_PASSWORD
      value: supassword
    - name: POSTGRES_REPLICATION_PASSWORD
      value: reppassword
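The manifest is applied in the usual way (the file name here is just a placeholder):

kubectl apply -f kubegres-issue-78.yaml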
The backup CronJob is created:
kind: CronJob
apiVersion: batch/v1beta1
metadata:
  name: backup-kubegres-issue-78
  namespace: kubegres-sandbox
  uid: 499fa9f5-5a59-48d9-8fa2-7b08c74ca34b
  resourceVersion: '1469331142'
  generation: 1
  creationTimestamp: '2021-12-27T14:02:36Z'
  ownerReferences:
    - apiVersion: kubegres.reactive-tech.io/v1
      kind: Kubegres
      name: kubegres-issue-78
      uid: db3cb326-bc37-434e-b3ba-44175b083eb6
      controller: true
      blockOwnerDeletion: true
spec:
  schedule: '*/3 * * * *'
  concurrencyPolicy: Forbid
  suspend: false
  jobTemplate:
    metadata:
      creationTimestamp: null
    spec:
      template:
        metadata:
          creationTimestamp: null
        spec:
          volumes:
            - name: backup-volume
              persistentVolumeClaim:
                claimName: kubegres-backup-issue-78-pvc
            - name: postgres-config
              configMap:
                name: base-kubegres-config
                defaultMode: 511
          containers:
            - name: backup-postgres
              image: my-private-registry/image:tag
              args:
                - sh
                - '-c'
                - /tmp/backup_database.sh
              env:
                - name: PGPASSWORD
                - name: KUBEGRES_RESOURCE_NAME
                  value: kubegres-issue-78
                - name: BACKUP_DESTINATION_FOLDER
                  value: /var/lib/backup
                - name: BACKUP_SOURCE_DB_HOST_NAME
                  value: kubegres-issue-78
                - name: POSTGRES_PASSWORD
                  value: supassword
                - name: POSTGRES_REPLICATION_PASSWORD
                  value: reppassword
              resources: {}
              volumeMounts:
                - name: backup-volume
                  mountPath: /var/lib/backup
                - name: postgres-config
                  mountPath: /tmp/backup_database.sh
                  subPath: backup_database.sh
              terminationMessagePath: /dev/termination-log
              terminationMessagePolicy: File
              imagePullPolicy: IfNotPresent
          restartPolicy: OnFailure
          terminationGracePeriodSeconds: 30
          dnsPolicy: ClusterFirst
          securityContext: {}
          schedulerName: default-scheduler
  successfulJobsHistoryLimit: 3
  failedJobsHistoryLimit: 1
status:
  lastScheduleTime: '2021-12-27T14:06:00Z'
  lastSuccessfulTime: '2021-12-27T14:06:05Z'
The imagePullSecrets field is missing there. On my single-node cluster the jobs are OK, because my private image is already present on the node (imagePullPolicy: IfNotPresent):
NAME                                COMPLETIONS   DURATION   AGE
backup-kubegres-issue-78-27343563   1/1           17s        7m46s
backup-kubegres-issue-78-27343566   1/1           5s         4m46s
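As an aside, one can confirm that the image is already cached on a node by inspecting the node status, which lists pulled images (a standard NodeStatus field; the node name is the original node from this cluster):

kubectl get node scw-k8s-solidev-default-34b29ff7154b4452a4ace2 \
  -o jsonpath='{range .status.images[*]}{.names}{"\n"}{end}'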
But if I add some nodes to the cluster:
NAME                                             STATUS   ROLES    AGE     VERSION
scw-k8s-solidev-default-34b29ff7154b4452a4ace2   Ready    <none>   20d     v1.23.0
scw-k8s-solidev-default-817558573a0244e09202dc   Ready    <none>   6m49s   v1.23.0
scw-k8s-solidev-default-ee42e6727e114831aeaac8   Ready    <none>   6m24s   v1.23.0
the backup job fails depending on which node it is scheduled on:
✦ ➜ kubectl get jobs
NAME                                COMPLETIONS   DURATION   AGE
backup-kubegres-issue-78-27343563   1/1           17s        9m57s
backup-kubegres-issue-78-27343566   1/1           5s         6m57s
backup-kubegres-issue-78-27343569   0/1           3m57s      3m57s
✦ ➜ kubectl describe jobs/backup-kubegres-issue-78-27343569
Name:        backup-kubegres-issue-78-27343569
Namespace:   kubegres-sandbox
(...)
Events:
  Type    Reason            Age    From            Message
  ----    ------            ----   ----            -------
  Normal  SuccessfulCreate  5m33s  job-controller  Created pod: backup-kubegres-issue-78-27343569-hb2d6
✦ ➜ kubectl describe pods/backup-kubegres-issue-78-27343569-hb2d6
Name:         backup-kubegres-issue-78-27343569-hb2d6
Namespace:    kubegres-sandbox
Priority:     0
Node:         scw-k8s-solidev-default-817558573a0244e09202dc/10.197.230.31
Start Time:   Mon, 27 Dec 2021 15:09:00 +0100
(...)
Events:
  Type     Reason                  Age                   From                     Message
  ----     ------                  ----                  ----                     -------
  Normal   Scheduled               6m12s                 default-scheduler        Successfully assigned kubegres-sandbox/backup-kubegres-issue-78-27343569-hb2d6 to scw-k8s-solidev-default-817558573a0244e09202dc
  Normal   SuccessfulAttachVolume  6m11s                 attachdetach-controller  AttachVolume.Attach succeeded for volume "pvc-cef4f654-d552-4a9d-b7c5-1accd1757530"
  Normal   Pulling                 4m46s (x4 over 6m8s)  kubelet                  Pulling image "my-private-registry/image:tag"
  Warning  Failed                  4m46s (x4 over 6m8s)  kubelet                  Failed to pull image "my-private-registry/image:tag": rpc error: code = Unknown desc = failed to pull and unpack image "my-private-registry/image:tag": failed to resolve reference "my-private-registry/image:tag": failed to authorize: failed to fetch anonymous token: unexpected status: 403 Forbidden
  Warning  Failed                  4m46s (x4 over 6m8s)  kubelet                  Error: ErrImagePull
  Warning  Failed                  4m18s (x6 over 6m7s)  kubelet                  Error: ImagePullBackOff
  Normal   BackOff                 58s (x20 over 6m7s)   kubelet                  Back-off pulling image "my-private-registry/image:tag"
The private image has not been pulled on the new nodes yet, so when the backup job is scheduled on one of them, the image pull fails because the imagePullSecrets field is missing from the CronJob spec.
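Until a fix is released, a stopgap is to patch the generated CronJob in place (a hedged example: Kubegres may overwrite this on its next reconciliation, so the patch can need reapplying):

kubectl -n kubegres-sandbox patch cronjob backup-kubegres-issue-78 \
  --type merge \
  -p '{"spec":{"jobTemplate":{"spec":{"template":{"spec":{"imagePullSecrets":[{"name":"registry-secret"}]}}}}}}'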
Thank you for those details, which will help me with the investigation.
Hi @alex-arica, I sent PR #103 to fix this issue.