feat: Add package registry to eck
Resolves #8925
Adding Elastic Package Registry (EPR) support to ECK has been a highly requested feature. EPR does not declare any resource references of its own, since it requires neither a license nor any other application.

The following was implemented for EPR:
- Defaults to TLS
- Sets the default container image to docker.elastic.co/package-registry/distribution
- Users can set their own images
- Users can update the config following the reference
- Kibana can reference EPR via `packageRegistryRef`, just like it references Elasticsearch and Enterprise Search
- If Kibana references EPR and TLS is enabled, the operator populates `xpack.fleet.registryUrl` and sets the environment variable `NODE_EXTRA_CA_CERTS` to the path of EPR's CA, which is mounted into the Kibana Pod
- If a user provides their own `NODE_EXTRA_CA_CERTS` with a mount, the controller combines the certificates, appending EPR's CA to the user-specified CA (a sketch of this merge logic follows the list)
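For the last bullet, here is a minimal sketch of the CA-combining behaviour, assuming the controller simply concatenates PEM bundles into a single file that `NODE_EXTRA_CA_CERTS` points at; the function name and exact shape are illustrative, not the actual implementation in this PR:

```go
package example

// combineCABundles appends the EPR CA to a user-provided PEM bundle so that a
// single file can be referenced via NODE_EXTRA_CA_CERTS. Illustrative only; the
// real controller code in this PR may differ.
func combineCABundles(userBundle, eprCA []byte) []byte {
	out := make([]byte, 0, len(userBundle)+len(eprCA)+1)
	out = append(out, userBundle...)
	// Make sure the user bundle ends with a newline before appending more PEM blocks.
	if len(out) > 0 && out[len(out)-1] != '\n' {
		out = append(out, '\n')
	}
	return append(out, eprCA...)
}
```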
This was tested with and without setting `NODE_EXTRA_CA_CERTS`, using the manifest below:
```yaml
apiVersion: epr.k8s.elastic.co/v1alpha1
kind: ElasticPackageRegistry
metadata:
  name: registry
spec:
  version: 9.1.2
  count: 1
  podTemplate:
    spec:
      containers:
      - name: package-registry
        image: docker.elastic.co/package-registry/distribution:lite-9.1.2
---
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: elasticsearch
spec:
  version: 9.1.2
  nodeSets:
  - name: default
    count: 1
---
apiVersion: kibana.k8s.elastic.co/v1
kind: Kibana
metadata:
  name: kibana
spec:
  version: 9.1.2
  count: 1
  elasticsearchRef:
    name: elasticsearch
  packageRegistryRef:
    name: registry
  config:
    telemetry.optIn: false
    xpack.fleet.isAirGapped: true
    xpack.fleet.agents.elasticsearch.hosts: ["https://elasticsearch-es-http.default.svc:9200"]
    xpack.fleet.agents.fleet_server.hosts: ["https://fleet-server-agent-http.default.svc:8220"]
    xpack.fleet.packages:
    - name: system
      version: latest
    - name: elastic_agent
      version: latest
    - name: fleet_server
      version: latest
    xpack.fleet.agentPolicies:
    - name: Fleet Server on ECK policy
      id: eck-fleet-server
      namespace: default
      monitoring_enabled:
      - logs
      - metrics
      unenroll_timeout: 900
      package_policies:
      - name: fleet_server-1
        id: fleet_server-1
        package:
          name: fleet_server
  podTemplate:
    spec:
      containers:
      - name: kibana
        env:
        - name: NODE_EXTRA_CA_CERTS
          value: /custom/user/ca-bundle.crt
        volumeMounts:
        - name: custom-ca
          mountPath: /custom/user
          readOnly: true
      volumes:
      - name: custom-ca
        secret:
          secretName: user-custom-ca-secret
---
apiVersion: v1
kind: Secret
metadata:
  name: user-custom-ca-secret
  namespace: default
type: Opaque
data:
ca-bundle.crt: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUZtVENDQTRHZ0F3SUJBZ0lVYjVrK2d6V3A5YjljWTV4bkhUcWZNdHFHUXIwd0RRWUpLb1pJaHZjTkFRRUwKQlFBd1hERUxNQWtHQTFVRUJoTUNXRmd4RlRBVEJnTlZCQWNNREVSbFptRjFiSFFnUTJsMGVURWNNQm9HQTFVRQpDZ3dUUkdWbVlYVnNkQ0JEYjIxd1lXNTVJRXgwWkRFWU1CWUdBMVVFQXd3UGRHVnpkQzVsYkdGemRHbGpMbU52Ck1CNFhEVEkxTURneU1ERTRNakl3T0ZvWERUTTFNRGd4T0RFNE1qSXdPRm93WERFTE1Ba0dBMVVFQmhNQ1dGZ3gKRlRBVEJnTlZCQWNNREVSbFptRjFiSFFnUTJsMGVURWNNQm9HQTFVRUNnd1RSR1ZtWVhWc2RDQkRiMjF3WVc1NQpJRXgwWkRFWU1CWUdBMVVFQXd3UGRHVnpkQzVsYkdGemRHbGpMbU52TUlJQ0lqQU5CZ2txaGtpRzl3MEJBUUVGCkFBT0NBZzhBTUlJQ0NnS0NBZ0VBMHljTGVySWR3LzdpbGlKMzVBUEZ4bUx6TFRnNWRhUStWSUttS2lNbStlTTYKanJOY3lnbGphNVFEbHYvMStGUm5hamhrRTBobHoycXEzTjk0U1pYN3M2eHBnQUVzMGVQQ3VaZVBNU2VUYlYyRgp0YlIxNnFuM0JjenVxN3laOXZwdHR3MmJRdkJkY3JzZFU4T2RYUWhGNFd4QUFwODRKYWlMNmkzMlA2K2VPODBwCmh3Z1kwS0F1bzZoZC8zaFpNME14M2MwRmJmU0JHaTUyOHZKODYzUDRXZlEwMWdtUUxVbGl0UlhhTUhiaDRXSm0KOU45c0psUXpnbkNuQjZ6YkZjZ2gweWxrakd0UzBIZEo3eSs3dmE0Q1BqdkxlWGpwTnZuQzRjTmlocnp4Wmw5bQphM0ZVdVpiU0lRekE2ZFlkdkdrT2V3OTJEek1BaTdldU14UDdyYVhRejZmc1N6U1V4N1RjQWl5M2E5VU9Fdi9rCk5NV3VTbDlUMHRRSkhJSzJMc0t0MlVKWVVHWk4wOWU2SUVSTlJOL0FIUjVDbTlhcVQ1Q2ZyQW9JVVhNdUg2S1oKN1JCZFFockRxL2xEQk54bWs5dW44V2lic0NSVnkvVXRJQ3lOSytxbGpGUWZEd01hNkRkd3BjcnpnTWZnU3RTawpLek1LRUJla2N0Q0Q4dHNmTjZYem5USmNBYUJETzFlQWZyT0Z2NG1PTXJqVG90OEYvK3pxN0dXNTlqWTRvdFhMCkY3TnpadFl0eWsvbDRvb2hUZUFuM1ptd1BDMGJFQ1FkTmpTVkZ6ZXJCamE4ZjhacGpKRzNjUllyVmh6YUNsRWMKRU5wbFRHcldVaUVwRDdnTnNlNWNDSnZpQU12NHdwait2QTVVNlA3Z0MxUUtKV2hWS3BVYWcvTmtTSUFCRmtrQwpBd0VBQWFOVE1GRXdIUVlEVlIwT0JCWUVGTWdldEVJajZtRWdsZURGNkVNdUY4NXVnYzdZTUI4R0ExVWRJd1FZCk1CYUFGTWdldEVJajZtRWdsZURGNkVNdUY4NXVnYzdZTUE4R0ExVWRFd0VCL3dRRk1BTUJBZjh3RFFZSktvWkkKaHZjTkFRRUxCUUFEZ2dJQkFEOFU3dm1yWmhHTUZiV2YzRDZlNy84TUwzWEhLRk5TNy9UeWF3U2tvdGVSTVdFbgp1RWhQK2dmbkdUT2ZITFlQeHl5eEJ4U041T29sZHRJclo5dnhBc2dlYWJzSkJaenhQVHpxU09VN3h3b09LcTlRCmdKRUYxL0ZmemFlR1V5dVE2S1ZaZ0QvZ1JPSW42Ri9OUGlzM1pvbUpPOStuVWdTTnNiUm9RYmdPUGdPV3Q3Z1gKVEhuOHJpdUp2OXRPNFBRN09Sa3pubDJYbERlcE9xNVpwSUtkcVl0Rm5MUjF3SllyREZESmt0Q3h6MzFob0FrZwpSVjlSU1BSMFFxZ1JQeFNpNGpXdkNGUk5XTUFJc0NadGJsWExRRUljWGI1YnlsWXV2a3psTTJ4dHlHK3FaRFhMCnFoZDVNeFZIUkpqTzE1VEdpZXFRcUpMVkZyVElhTHFoaXZpQ1pUbDJoVkYxVlpPVG05MU5aeE53M25RL3JyeDgKK2VQV2xTWlZKWXc3SDRkWkx5WTFjRUxLT0YrZDJybVNSZ2pWaHZycUZ3R1M3MUQzYkV4Y0dSakNrOHNQWEZyRwpsOFRzY05RMXBPSGVuNlJhOFhVdGtxU1doZllFb3owZjBEem4wYmt4c2VWaCttS1BHV3QxcHdlemVFTFVwaHE3CmwwSVRLeis1b1lqYWVHTDRia25kcWlpemwzWkc2N0lYL3VyR0dQVUxkLzU1NEtRMFFPMS92S3Y2dE1YMWc0dVMKWHdWc0pzQjlrTUIwRFFxbDhRYmg0UEJ2ZW9RRTZvL3BycXRtWjR1RWdDMCt1cm5paDlCY1FweFNKOUljR1kxTQpBQzRBcG5Pem1CYTFhUVBMcDRaRFIxQXpFK1hXWDd2WWNWYUxleUJxRzRja3dwbUtOUnhpcnJjS2NaMkYKLS0tLS1FTkQgQ0VSVElGSUNBVEUtLS0tLQo=
---
apiVersion: agent.k8s.elastic.co/v1alpha1
kind: Agent
metadata:
  name: fleet-server
spec:
  version: 9.1.2
  kibanaRef:
    name: kibana
  elasticsearchRefs:
  - name: elasticsearch
  mode: fleet
  fleetServerEnabled: true
  policyID: eck-fleet-server
  deployment:
    replicas: 1
    podTemplate:
      spec:
        serviceAccountName: fleet-server
        automountServiceAccountToken: true
        resources:
          requests:
            cpu: 200m
            memory: 1Gi
          limits:
            cpu: 1
            memory: 2Gi
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: fleet-server
  namespace: default
rules:
- apiGroups: [""]
  resources:
  - pods
  - namespaces
  - nodes
  verbs:
  - get
  - watch
  - list
- apiGroups: ["apps"]
  resources:
  - replicasets
  verbs:
  - get
  - watch
  - list
- apiGroups: ["batch"]
  resources:
  - jobs
  verbs:
  - get
  - watch
  - list
- apiGroups: ["coordination.k8s.io"]
  resources:
  - leases
  verbs:
  - get
  - create
  - update
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: fleet-server
  namespace: default
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: fleet-server
  namespace: default
subjects:
- kind: ServiceAccount
  name: fleet-server
  namespace: default
roleRef:
  kind: ClusterRole
  name: fleet-server
  apiGroup: rbac.authorization.k8s.io
```
:white_check_mark: Snyk checks have passed. No issues have been found so far.
For some reason the container does not seem to listen on `8080` and is killed by the kubelet. I have the same problem when I try to run the e2e test `TestElasticPackageRegistryStandalone`:
```
Containers:
  package-registry:
    Container ID:   containerd://964dc18c8a1b461f7bba941dfc688af4375ce6e00a247f3326cd614622b74b89
    Image:          docker.elastic.co/package-registry/distribution:9.0.5
    Image ID:       docker.elastic.co/package-registry/distribution@sha256:15edf005ee2cb3a9611e1dae535c506134de98f6e90eaaa8419a3161b4c7b858
    Port:           8080/TCP
    Host Port:      0/TCP
    State:          Running
      Started:      Tue, 26 Aug 2025 12:41:15 +0200
    Last State:     Terminated
      Reason:       Error
      Exit Code:    2
      Started:      Tue, 26 Aug 2025 12:38:55 +0200
      Finished:     Tue, 26 Aug 2025 12:41:15 +0200
    Ready:          False
    Restart Count:  1
[...]
  Normal   Killing    2m3s                kubelet  Container package-registry failed startup probe, will be restarted
  Warning  Unhealthy  3s (x4 over 2m23s)  kubelet  Startup probe failed: Get "https://10.31.86.16:8080/health": dial tcp 10.31.86.16:8080: connect: connection refused
```
@barkbay it takes a long time (several minutes) for the EPR to start. Can it be your issue?
I'm using the default startup probe set by the controller in this PR:
```yaml
startupProbe:
  failureThreshold: 3
  httpGet:
    path: /health
    port: 8080
    scheme: HTTPS
  initialDelaySeconds: 120
  periodSeconds: 10
  successThreshold: 1
  timeoutSeconds: 5
```
I can try to increase/remove it..
> @barkbay it takes a long time (several minutes) for the EPR to start. Can it be your issue?

Good catch @jeanfabrice! It took something like 5 minutes for the packages to be "loaded"; the default probe is maybe a bit too optimistic, so the container is killed prematurely.
It 100% is the startup probe time. I had a lot of issues with this and at one point had it set to 10 minutes. I think 7 minutes might be a good in-between? The problem arises when a user runs the production image, which is around 10+ GB and takes upwards of 10 minutes to start.
@barkbay With the startup taking anywhere from 2 to 10+ minutes, do we want to keep the initial delay at 120 seconds but increase the failure threshold and `timeoutSeconds` to account for the 10+ minutes? Something like this, where the maximum amount of time is 10 minutes (600 seconds):
```go
// startupProbe is the startup probe for the packageregistry container
func startupProbe(useTLS bool) corev1.Probe {
	scheme := corev1.URISchemeHTTP
	if useTLS {
		scheme = corev1.URISchemeHTTPS
	}
	return corev1.Probe{
		FailureThreshold:    16,
		InitialDelaySeconds: 120,
		PeriodSeconds:       10,
		SuccessThreshold:    1,
		TimeoutSeconds:      30,
		ProbeHandler: corev1.ProbeHandler{
			HTTPGet: &corev1.HTTPGetAction{
				Port:   intstr.FromInt(HTTPPort),
				Path:   "/health",
				Scheme: scheme,
			},
		},
	}
}
```
> It 100% is the startup probe time. I had a lot of issues with this and at one point had it set to 10 minutes. I think 7 minutes might be a good in-between?

I think my question would be "why do we need a startup probe"? Maybe a readiness probe is enough?
Ahh yes, I guess the application has already started, therefore a startup probe doesn't make sense here.
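For reference, a minimal sketch of what a readiness-only probe could look like, mirroring the `startupProbe` helper above; `HTTPPort` is redeclared here to keep the snippet self-contained, and the threshold values are illustrative rather than what ends up in the PR:

```go
package example

import (
	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/util/intstr"
)

// HTTPPort mirrors the package registry HTTP port used in this PR.
const HTTPPort = 8080

// readinessProbe sketches a readiness probe against /health, standing in for
// the startup probe discussed above. Values are illustrative only.
func readinessProbe(useTLS bool) corev1.Probe {
	scheme := corev1.URISchemeHTTP
	if useTLS {
		scheme = corev1.URISchemeHTTPS
	}
	return corev1.Probe{
		FailureThreshold:    3,
		PeriodSeconds:       10,
		SuccessThreshold:    1,
		TimeoutSeconds:      5,
		ProbeHandler: corev1.ProbeHandler{
			HTTPGet: &corev1.HTTPGetAction{
				Port:   intstr.FromInt(HTTPPort),
				Path:   "/health",
				Scheme: scheme,
			},
		},
	}
}
```

With only a readiness probe, the Pod simply stays out of the Service endpoints until `/health` responds, so a slow package load delays traffic instead of triggering a restart.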
@tehbooom 👋 please let me know when you need another review, thanks!
@barkbay Added EPR to ECK diagnostics here.

I still need to update our documentation. I know we changed how we do documentation, so if you could point me in the right direction to where the ECK docs live, that would be great. This PR is ready for another review. Thanks!
Documentation has moved here: https://github.com/elastic/docs-content. Please note that the main branch of that repo is published immediately, so do not merge the doc PR until the feature has been released.

I'll try to take another look at your PR this week; I'm struggling to keep up with the pace of PRs opened in this repo 😅
buildkite test this -f p=gke,E2E_TAGS=epr
E2E tests are still failing with:
Warning Failed 12m kubelet Failed to pull image "docker.elastic.co/package-registry/distribution:9.1.2": failed to pull and unpack image "docker.elastic.co/package-registry/distribution:9.1.2": failed to extract layer sha256:57bbe197467b8b19ba0705f05ee41860ff3bb44020ed5986df96bfed4614e630: write /var/lib/containerd/io.containerd.snapshotter.v1.overlayfs/snapshots/396/fs/packages/package-storage/security_detection_engine-8.6.8.zip: no space left on device
We also have to find a solution for this. IIUC https://github.com/elastic/package-registry/pull/1335 would be the way to go? I'll also try to check if we can increase the disk on GKE nodes, but it would still mean that we may have to skip other providers (AWS, Azure, Kind...).
IIUC the image currently requires ~14Gi:
docker.elastic.co/package-registry/distribution 9.1.2 d127b26dc3000 13.8GB
buildkite test this -f p=gke,E2E_TAGS=epr
Main blocker to merge this imo is the lack of UBI images for the package registry.
This blocker has been addressed in https://github.com/elastic/package-registry/pull/1451, which is now merged.
In this PR we are using the Package Registry distribution images. To support UBI there we would also need to update https://github.com/elastic/package-storage-infra/blob/13bf4e9ba03c028b16ed37772cd0d1afaa45af4f/.buildkite/scripts/build_distributions.sh.
The beginnings of this needed PR are here @jsoriano
Definitely seeing some issues testing on openshift, but it doesn't seem like it's ocp specific:
{"log.level":"error","@timestamp":"2025-11-20T17:44:02.025Z","log.logger":"manager.eck-operator","message":"Reconciler error","service.version":"9.3.0-SNAPSHOT+","service.type":"eck","ecs.version":"1.4.0","controller":"packageregistry-controller","object":{"name":"registry","namespace":"elastic"},"namespace":"elastic","name":"registry","reconcileID":"8a30e396-91e2-4a86-a9b3-79368a4032a6","error":"services \"registry-epr-http\" is forbidden: cannot set blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on: , <nil>","errorCauses":[{"error":"services \"registry-epr-http\" is forbidden: cannot set blockOwnerDeletion if an ownerReference refers to a resource you can't set finalizers on: , <nil>"}],"error.stack_trace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).reconcileHandler\n\t/root/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:474\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).processNextWorkItem\n\t/root/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:421\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller[...]).Start.func1.1\n\t/root/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:296"}
buildkite test this -f p=gke,E2E_TAGS=epr
Nope, it's ocp specific: https://github.com/elastic/cloud-on-k8s/pull/8800/commits/49b1e56493bcb567a61bbeec229598edfba2c6b3. (was missing packageregistries/finalizers RBAC permissions)
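For context, the permission added in that commit would look roughly like the following PolicyRule; the API group follows the `epr.k8s.elastic.co` CRD used in this PR and the resource name follows the comment above, but treat the exact names as illustrative:

```go
package example

import rbacv1 "k8s.io/api/rbac/v1"

// packageRegistryFinalizersRule sketches the RBAC rule that was missing on
// OpenShift: "update" on the finalizers subresource of the package registry
// CRD, which the OwnerReferencesPermissionEnforcement admission plugin requires
// before blockOwnerDeletion can be set on owner references.
var packageRegistryFinalizersRule = rbacv1.PolicyRule{
	APIGroups: []string{"epr.k8s.elastic.co"},
	Resources: []string{"packageregistries/finalizers"},
	Verbs:     []string{"update"},
}
```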
And more fun on ocp:
{"log.level":"info","@timestamp":"2025-11-20T20:49:24.212Z","log.logger":"manager.eck-operator","message":"would violate PodSecurity \"restricted:latest\": runAsNonRoot != true (pod or container \"package-registry\" must set securityContext.runAsNonRoot=true), seccompProfile (pod or container \"package-registry\" must set securityContext.seccompProfile.type to \"RuntimeDefault\" or \"Localhost\")","service.version":"3.3.0-rc1-SNAPSHOT+","service.type":"eck","ecs.version":"1.4.0","controller":"packageregistry-controller","object":{"name":"registry","namespace":"elastic"},"namespace":"elastic","name":"registry","reconcileID":"2521d32a-cdc5-4c36-ba90-64011f78d67b"}
And we don't set runAsUser/runAsGroup for any CRD either:
containers[0].runAsUser: Invalid value: 1000: must be in the ranges: [1000730000, 1000739999]
I'm fixing all of these issues....
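For the PodSecurity errors above, here is a hedged sketch of a container security context that satisfies the `restricted` profile while leaving `runAsUser` unset so OpenShift can assign a UID from the namespace range; it is illustrative, not necessarily the exact defaults added in this PR:

```go
package example

import (
	corev1 "k8s.io/api/core/v1"
	"k8s.io/utils/ptr"
)

// restrictedSecurityContext sketches a container security context compatible
// with the "restricted" Pod Security profile: non-root, RuntimeDefault seccomp,
// no privilege escalation, all capabilities dropped. RunAsUser is intentionally
// left unset so OpenShift can pick a UID from the namespace's allowed range.
func restrictedSecurityContext() *corev1.SecurityContext {
	return &corev1.SecurityContext{
		RunAsNonRoot:             ptr.To(true),
		AllowPrivilegeEscalation: ptr.To(false),
		Capabilities: &corev1.Capabilities{
			Drop: []corev1.Capability{"ALL"},
		},
		SeccompProfile: &corev1.SeccompProfile{
			Type: corev1.SeccompProfileTypeRuntimeDefault,
		},
	}
}
```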
All of the issues running in OpenShift/OCP-style clusters have been resolved and verified. I'm waiting to verify the UBI images specifically once they are built/pushed, and then this should be getting closer to a mergeable state.
The UBI images seem to run without issue:
```
openshift.io/scc: restricted-v2
packageregistry.k8s.elastic.co/config-hash: 2422330696
seccomp.security.alpha.kubernetes.io/pod: runtime/default
security.openshift.io/validated-scc-subject-type: user
Status:          Running
SeccompProfile:  RuntimeDefault
IP:              10.129.2.98
IPs:
  IP:            10.129.2.98
Controlled By:   ReplicaSet/registry-epr-858f669ff
Containers:
  package-registry:
    Container ID:  cri-o://a1e3ce5cf092d7b636a9d24b08ef6bd2d93e45685dcb3d01b4a6bf872a51db79
    Image:         docker.elastic.co/package-registry/distribution:lite-ubi
```
I think the one final change is to ensure that in an OCP environment we are using the UBI images by default. This seems to differ from the standard stack images, which are UBI-based by default from 9.x forward. I'll make the changes and verify.
The suffix should handle this when --ubi-only is set. I believe this is how we normally handle this in other controllers.
https://github.com/elastic/cloud-on-k8s/pull/8800/files#diff-52e0749d4ea9659ff8934fe1491cc88fc5508988f026b1ca8a0704e3a75da924R107-R111
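Roughly, the suffix-based selection being referenced is just string concatenation on the default image when the operator runs with `--ubi-only`; the helper below is an illustrative sketch only, and the actual flag handling and tag layout live in the linked code:

```go
package example

// applyImageSuffix sketches the --ubi-only behaviour discussed above: append a
// suffix to the default image reference so the UBI variant is pulled. The
// function name and suffix value are assumptions for illustration.
func applyImageSuffix(image, suffix string, ubiOnly bool) string {
	if ubiOnly && suffix != "" {
		return image + suffix
	}
	return image
}
```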
Edit: Nevermind, it's because I used short-lived certificates to test certificate rotation, and it seems this requires the Pod to be recreated. @tehbooom Could you confirm that certificates are not hot reloaded?
@barkbay I believe that the certificates are not hot reloaded. Quoting the documentation:

> The NODE_EXTRA_CA_CERTS environment variable is only read when the Node.js process is first launched.

Looking at the code, it also seems to me that the env var and the contents of the path (if the former is set) are read only once (code link).
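Since the CA bundle is only read at process start, the usual way to handle rotation is to roll the Pod when the bundle changes, for example by hashing it into a pod template annotation (similar in spirit to the `packageregistry.k8s.elastic.co/config-hash` annotation visible in the describe output above). A hedged sketch, with an illustrative annotation key:

```go
package example

import (
	"crypto/sha256"
	"fmt"

	corev1 "k8s.io/api/core/v1"
)

// setCAHashAnnotation hashes the CA bundle into a pod template annotation so
// that a certificate change alters the pod spec and the Deployment rolls,
// restarting the Node.js process with the new NODE_EXTRA_CA_CERTS contents.
// The annotation key is illustrative.
func setCAHashAnnotation(tpl *corev1.PodTemplateSpec, caBundle []byte) {
	if tpl.Annotations == nil {
		tpl.Annotations = map[string]string{}
	}
	tpl.Annotations["example.k8s.elastic.co/ca-hash"] = fmt.Sprintf("%x", sha256.Sum256(caBundle))
}
```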