cockroach-operator
Cluster fails to deploy when using Kyverno to mutate image name
Hello, I've deployed the operator in our cluster using the manifests, and everything works as expected during install.
When deploying a cluster using example.yaml, the vcheck job seems to run indefinitely. The job pod logs show the version as expected. The operator logs don't point to any specific error at first, but after some time some generic error messages appear.
After some trial and error, I determined that the issue seems to come from a Kyverno policy that we use to patch images to use our pull-through cache.
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: replace-dockerhub-registry-container
spec:
  background: true
  rules:
    # Add the registry and tag if set implicitly
    - name: set-default-registry-container
      match:
        resources:
          kinds:
            - Pod
      mutate:
        foreach:
          # Containers
          - list: "request.object.spec.containers"
            patchStrategicMerge:
              spec:
                containers:
                  - (name): "{{ element.name }}"
                    (image): |-
                      !{{ images.containers."{{ element.name }}".referenceWithTag }}
                    image: |-
                      {{ images.containers."{{ element.name }}".referenceWithTag }}
    # For all non-official Docker Hub images, replace the registry with the ECR pull-through cache
    - name: replace-dockerhub-registry-container
      match:
        resources:
          kinds:
            - Pod
      preconditions:
        any:
          - key: "{{ request.object.spec.containers[*].image }}"
            operator: AnyIn
            value: "docker.io/*/*"
      mutate:
        foreach:
          # Containers
          - list: "request.object.spec.containers"
            patchStrategicMerge:
              spec:
                containers:
                  - (name): "{{ element.name }}"
                    (image): |-
                      docker.io/*/*:{{images.containers."{{element.name}}".tag}}
                    image: our.pullthrough.cache.url/{{ images.containers."{{element.name}}".path }}:{{images.containers."{{element.name}}".tag}}
    # The only remaining Docker Hub images are official and can be replaced with the ECR pull-through cache
    # Official images require the prefix '/library'
    - name: replace-official-dockerhub-registry-container
      match:
        resources:
          kinds:
            - Pod
      preconditions:
        any:
          - key: "{{ request.object.spec.containers[*].image }}"
            operator: AnyIn
            value: "docker.io/*"
      mutate:
        foreach:
          # Containers
          - list: "request.object.spec.containers"
            patchStrategicMerge:
              spec:
                containers:
                  - (name): "{{ element.name }}"
                    (image): |-
                      docker.io/*:{{images.containers."{{element.name}}".tag}}
                    image: our.pullthrough.cache.url/library/{{ images.containers."{{element.name}}".path }}:{{images.containers."{{element.name}}".tag}}
Removing this policy allowed the cluster to initialize properly.
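To make the effect of the policy concrete, here is a rough Python sketch of what I understand the three rules to do to a container image string. It is only an approximation for illustration: `parse_image` and `rewrite_image` are names I made up, the parsing follows the usual Docker convention rather than Kyverno's actual implementation, digests are ignored, and the image tag in the usage note is a hypothetical placeholder.

```python
from typing import NamedTuple

# Placeholder cache host, taken from the policy above.
CACHE = "our.pullthrough.cache.url"


class ImageRef(NamedTuple):
    registry: str
    path: str
    tag: str


def parse_image(image: str) -> ImageRef:
    """Split an image reference, defaulting to docker.io and the 'latest' tag.

    Simplified assumption: digests (@sha256:...) are not handled.
    """
    first, sep, rest = image.partition("/")
    # Docker convention: the first component is a registry only if it looks like a host.
    if sep and ("." in first or ":" in first or first == "localhost"):
        registry, remainder = first, rest
    else:
        registry, remainder = "docker.io", image
    # A colon after the last slash separates the tag from the path.
    last_component = remainder.rsplit("/", 1)[-1]
    if ":" in last_component:
        path, tag = remainder.rsplit(":", 1)
    else:
        path, tag = remainder, "latest"
    return ImageRef(registry, path, tag)


def rewrite_image(image: str) -> str:
    """Approximate the combined result of the three rules for one image."""
    ref = parse_image(image)
    if ref.registry != "docker.io":
        # Rule 1 only: registry and tag made explicit, no redirect.
        return f"{ref.registry}/{ref.path}:{ref.tag}"
    if "/" in ref.path:
        # Rule 2: non-official Docker Hub image.
        return f"{CACHE}/{ref.path}:{ref.tag}"
    # Rule 3: official image, which needs the 'library/' prefix.
    return f"{CACHE}/library/{ref.path}:{ref.tag}"
```

For example, `rewrite_image("cockroachdb/cockroach:v1.2.3")` gives `our.pullthrough.cache.url/cockroachdb/cockroach:v1.2.3`, so the image on the running pod no longer matches the string written in the cluster spec. I have not confirmed that this mismatch is what the operator trips over; it is just the visible difference the policy introduces.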