Reconcile not triggered on updates to container status
My controller owns and is responsible for creating a Deployment. The single container in this Deployment has a liveness probe and a startup probe. When a probe reaches its failure threshold, kubelet restarts the container. When that happens, I'd like to receive a reconcile so I can retrieve the latest container status, which I currently do not.
My understanding was that because my controller Owns(&corev1.Pod{}) (and the pod has the correct owner references), the update event on the pod triggered by kubelet, e.g. Warning Unhealthy 4m17s (x45 over 39m) kubelet Liveness probe failed: HTTP probe failed with statuscode: 503, would cause a reconcile.
Apologies if I am missing something.
Manager setup:
func ignoreStatusUpdates() predicate.Predicate {
    return predicate.Funcs{
        UpdateFunc: func(e event.UpdateEvent) bool {
            // Ignore updates to CR status, in which case metadata.generation does not change
            return e.ObjectOld.GetGeneration() != e.ObjectNew.GetGeneration()
        },
    }
}
func (r *MyController) SetupWithManager(mgr ctrl.Manager) error {
    return ctrl.NewControllerManagedBy(mgr).
        For(&appv1alpha1.MyCR{}, builder.WithPredicates(ignoreStatusUpdates())).
        Owns(&appsv1.Deployment{}).
        Owns(&corev1.Pod{}).
        Owns(&corev1.Secret{}).
        Owns(&corev1.ConfigMap{}).
        Owns(&corev1.Service{}).
        Complete(r)
}
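As an aside, ignoreStatusUpdates() looks equivalent to the built-in predicate.GenerationChangedPredicate from controller-runtime's predicate package, so (as a sketch, assuming a reasonably recent controller-runtime version) the For() clause could also be written as:

    // GenerationChangedPredicate likewise passes update events only when
    // metadata.generation changes, i.e. on spec changes but not status-only updates.
    For(&appv1alpha1.MyCR{}, builder.WithPredicates(predicate.GenerationChangedPredicate{})).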
Are you not getting reconciles for the MyCR parent object when the pod restarts?
Usually when you have owned objects, you check them in your reconcile loop whenever they trigger the parent object to be reconciled. An event on an owned object enqueues a reconcile for the parent CR, and in the Reconcile method you verify that the owned object matches the state MyCR expects for the cluster.
I think this only works if the Pod has an ownerRef to MyCR. Is that the case?
For more details, see: https://github.com/kubernetes-sigs/controller-runtime/blob/aeac9c59d7047a66d1d4ef521f66042b024cbb3b/pkg/builder/controller.go#L344
Apologies for the slow reply.
Are you not getting reconciles for the MyCR parent object when the pod restarts?
From my understanding, kubelet restarts the container, not the pod, when the liveness/readiness/startup probes fail. The pod does get an event for the container restart, though my reconciliation still isn't triggered, e.g., Warning Unhealthy 3m52s (x5 over 4m4s) kubelet Startup probe failed: HTTP probe failed with statuscode: 503
In my case I have a Deployment whose pod template defines the unhealthy containers (Deployment -> Pod -> Container). Deployments and Pods implement the Object interface and can therefore have an owner reference defined, which I have done: both are owned by MyCR.
I think this only works if the Pod has an ownerRef to MyCR. Is that the case?
@sbueringer Yes this is the case, am I missing something?
Probably this one: https://github.com/kubernetes-sigs/controller-runtime/blob/main/pkg/builder/controller.go#L118
// The default behavior reconciles only the first controller-type OwnerReference of the given type.
// Use Owns(object, builder.MatchEveryOwner) to reconcile all owners.
I assume you are not setting controller: true on the ownerRef, so you'll have to use builder.MatchEveryOwner
@sbueringer I'm setting controller: true on the pod's ownerRef:
ownerReferences:
- apiVersion: apps/v1
  blockOwnerDeletion: true
  controller: true
  kind: ReplicaSet
  name: my-cr-12345
  uid: xxx
Not too sure why the kind here gets set as ReplicaSet and not MyCR - the owner reference is set using controllerutil.SetOwnerReference.
Because it is the controller, this should be ok without builder.MatchEveryOwner right?
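For context, my understanding of the two controllerutil helpers (a sketch; the variable names myCR, pod and deployment are illustrative, and error handling is omitted):

    // SetOwnerReference adds an ownerRef without controller: true; an object can
    // carry several of these.
    _ = controllerutil.SetOwnerReference(myCR, pod, r.Scheme)

    // SetControllerReference adds the single ownerRef with controller: true and
    // returns an error if a different controller owner is already set.
    _ = controllerutil.SetControllerReference(myCR, deployment, r.Scheme)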
There can only be one ownerRef with controller: true per object, and in this case it should be the ReplicaSet, not your CRD.
And no, if your CRD's ownerRef is not set with controller: true, it won't work without builder.MatchEveryOwner.
Thanks for the info. Assuming I've set this in the correct place (see below), I am still not getting reconcile events as originally described.
// SetupWithManager sets up the controller with the Manager.
func (r *MyController) SetupWithManager(mgr ctrl.Manager) error {
    return ctrl.NewControllerManagedBy(mgr).
        For(&appv1alpha1.MyCR{}, builder.WithPredicates(ignoreStatusUpdates())).
        Owns(&appsv1.Deployment{}).
        Owns(&corev1.Pod{}, builder.MatchEveryOwner).
        Owns(&corev1.Secret{}).
        Owns(&corev1.ConfigMap{}).
        Owns(&corev1.Service{}).
        Complete(r)
}
Then I have no idea
func ignoreStatusUpdates() predicate.Predicate {
    return predicate.Funcs{
        UpdateFunc: func(e event.UpdateEvent) bool {
            // Ignore updates to CR status, in which case metadata.generation does not change
            return e.ObjectOld.GetGeneration() != e.ObjectNew.GetGeneration()
        },
    }
}
Wouldn't this filter out the events from an owned object? Meaning that since the generation of the parent object isn't changed, you may not get the event when the owned object triggers the reconciliation of the parent object.
https://github.com/kubernetes-sigs/controller-runtime/issues/2684#issuecomment-1942450084
Wouldn't this filter out the events from an owned object?
From doing some debugging, this event filter isn't receiving the pod restart events either. Do you have any cases where this behaviour has worked for you?
So I found that it is partly working as expected and the CR is receiving reconcile events on liveness probe failure. However, it is when the startup probe fails in one of the pod's containers that any further reconciles are blocked. I assume this is because there would be no difference in the pod's overall status (NotReady), so no event is sent?
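For what it's worth, predicates can also be scoped to the Pod watch itself rather than only to For(). A sketch (assuming controller-runtime >= v0.15; podStatusChanged is a made-up helper name) that passes an update whenever the container statuses differ, e.g. in restart count, readiness or state:

    func podStatusChanged() predicate.Predicate {
        return predicate.Funcs{
            UpdateFunc: func(e event.UpdateEvent) bool {
                oldPod, okOld := e.ObjectOld.(*corev1.Pod)
                newPod, okNew := e.ObjectNew.(*corev1.Pod)
                if !okOld || !okNew {
                    return false
                }
                // Enqueue the owner when any container status changed.
                return !reflect.DeepEqual(oldPod.Status.ContainerStatuses, newPod.Status.ContainerStatuses)
            },
        }
    }

    // Wired up in SetupWithManager:
    //   Owns(&corev1.Pod{}, builder.WithPredicates(podStatusChanged())).

This doesn't change the owner-reference mapping itself; it only shows that the generation-based filter on For() is not the only place events can be filtered.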
1.
If you could share your Reconcile function, it would help to find the problem.
I guess you might have missed setting the controller reference before creating the deployment. Consider adding SetControllerReference like this:
if err := ctrl.SetControllerReference(mycr, deployment, r.Scheme); err != nil {
    log.Error(err, "unable to set controller reference")
    return ctrl.Result{}, err
}
if err := r.Create(ctx, deployment); err != nil {
    log.Error(err, "unable to create deployment")
    return ctrl.Result{}, err
}
2.
Thus, you don't need to include these resources in the Owns configuration unless you create them directly and set their controller reference with SetControllerReference:
Owns(&corev1.Pod{}).
Owns(&corev1.Secret{}).
Owns(&corev1.ConfigMap{}).
Owns(&corev1.Service{})
If you really need to watch these resources, you should use WatchesRawSource. However, make sure to implement proper filtering to handle them correctly.
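The exact WatchesRawSource signature has changed between controller-runtime versions, so as one possible sketch (assuming controller-runtime >= v0.15 and a hypothetical pod label app.example.com/owned-by carrying the MyCR name), a plain Watches with a map function that enqueues the parent could look like:

    Watches(
        &corev1.Pod{},
        handler.EnqueueRequestsFromMapFunc(func(ctx context.Context, obj client.Object) []reconcile.Request {
            // Hypothetical label the controller writes into the pod template.
            owner, ok := obj.GetLabels()["app.example.com/owned-by"]
            if !ok {
                return nil
            }
            return []reconcile.Request{{
                NamespacedName: types.NamespacedName{Namespace: obj.GetNamespace(), Name: owner},
            }}
        }),
    ).

Whatever the mechanism, the filtering (labels, predicates, or selectors) is what keeps such a watch from enqueueing unrelated pods.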
3.
Since ownerReferences is a list, MatchEveryOwner is used to match the entire ownerReferences list of an object. Without MatchEveryOwner, the default behavior is to only match the primary ownerReference with controller=true.
However, since the Pod is not directly created by your controller but indirectly through the ReplicaSet controller, the Pod's ownerReferences will only contain a single reference pointing to the ReplicaSet. As a result, it won't match appv1alpha1.MyCR{} and trigger reconciliation.
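Conceptually (a simplified sketch of my understanding of the default Owns mapping, not its literal implementation; hasMyCRControllerOwner and the app.example.com group are made up here), only the controller ownerRef is inspected and it has to point at the For() type:

    // For a Deployment-managed pod the controller ownerRef is the ReplicaSet,
    // so a check like this returns false and nothing is enqueued for MyCR.
    // The real handler also compares the API group, hence the prefix check.
    func hasMyCRControllerOwner(pod *corev1.Pod) bool {
        for _, ref := range pod.GetOwnerReferences() {
            if ref.Controller != nil && *ref.Controller {
                return ref.Kind == "MyCR" && strings.HasPrefix(ref.APIVersion, "app.example.com/")
            }
        }
        return false
    }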
The Kubernetes project currently lacks enough contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue as fresh with /remove-lifecycle stale
- Close this issue with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue as fresh with /remove-lifecycle rotten
- Close this issue with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten