Trigger primary Custom Resource delete from managed Dependent Resource
Feature request
I have built a Kubernetes Operator based on Java Operator SDK v4.3.1 that handles a primary custom resource and several managed dependent resources.
One of the dependent resources is a Pod running a "critical" process.
If this dependent resource is deleted externally for any reason (by a user, by an application, or by a crash), the primary custom resource should be marked for deletion.
What did you do?
This is an implementation example to describe the scenario:
@ControllerConfiguration(
    name = "myresourcereconciler",
    dependents = {
        @Dependent(
            name = "configmapdependentresource",
            type = ConfigMapDependentResource.class,
            reconcilePrecondition = ConfigMapReconcileCondition.class
        ),
        @Dependent(
            name = "criticalpoddependentresource",
            type = CriticalPodDependentResource.class
        ),
        @Dependent(
            name = "servicedependentresource",
            type = ServiceDependentResource.class
        )
    }
)
public class MyResourceReconciler implements Reconciler<MyResource>, Cleaner<MyResource> {

    @Override
    public UpdateControl<MyResource> reconcile(MyResource resource, Context<MyResource> context) throws Exception {
        // Reconcile implementation (elided); updatedResource stays null when nothing changed
        return updatedResource != null ? UpdateControl.patchStatus(updatedResource) : UpdateControl.noUpdate();
    }

    @Override
    public DeleteControl cleanup(MyResource resource, Context<MyResource> context) {
        // Cleanup implementation (elided)
        return DeleteControl.defaultDelete();
    }
}
...
@KubernetesDependent(labelSelector = MyResource.LABEL_SELECTOR)
public class CriticalPodDependentResource extends CRUDKubernetesDependentResource<Pod, MyResource> {

    @Override
    protected Pod desired(MyResource primary, Context<MyResource> context) {
        // Desired Pod creation (elided)
        return pod;
    }

    @Override
    public void delete(MyResource primary, Context<MyResource> context) {
        // Expected this to be called when the dependent resource is deleted externally
        context.getClient().resource(primary).delete();
    }
}
What did you expect to see?
CriticalPodDependentResource.delete() operation called on dependent resource external deletion.
What did you see instead? Under which circumstances?
After I delete the critical dependent resource externally, reconciliation of the primary custom resource is triggered instead, which was more or less expected...
Hi @rguillens,
there are multiple things here:
- delete() is called only if a precondition does not hold on a DR, or when the whole workflow is being cleaned up (that is, the custom resource is being deleted). If a resource is deleted by someone else, delete() is not called; the resource is just reconciled and re-created.
- There is another aspect to this: how do you know whether the resource was already created before? For example, the reconciliation starts and creates the ConfigMap and Service DRs, but then the operator process/pod suddenly terminates, so there will be two resources but no pod. When the operator starts again, how do you know whether the pod was there and got deleted, or was simply never created? This means you need to store some state somewhere recording that the pod was already created. See how state is supported in DRs: https://javaoperatorsdk.io/docs/dependent-resources#external-state-tracking-dependent-resources although this is not necessarily your case.
So this is how I would solve it:
- store the state after the pod is created
- before the workflow is reconciled, check whether the pod exists; if it does not, but the state says it was created before, simply call delete on the primary custom resource using the client and exit the reconciliation. Currently you can do this only with standalone workflows.
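The decision in the two steps above can be sketched in plain Java, independent of the SDK. This is only an illustration under stated assumptions: CriticalPodGuard, Action, markPodCreated, forget and decide are hypothetical names, not Java Operator SDK API, and the in-memory set stands in for whatever durable state is actually used (for example a ConfigMap).

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical helper, not part of Java Operator SDK. */
final class CriticalPodGuard {

    /** What the reconciler should do before running the workflow. */
    enum Action { RECONCILE, DELETE_PRIMARY }

    // Assumption: stand-in for durable state (e.g. a ConfigMap).
    // Holds the ids of primaries whose critical pod has been created once.
    private final Set<String> podCreated = ConcurrentHashMap.newKeySet();

    /** Step 1: record the state after the pod is created. */
    void markPodCreated(String primaryId) {
        podCreated.add(primaryId);
    }

    /** Drop the state once the primary is gone, so it does not leak. */
    void forget(String primaryId) {
        podCreated.remove(primaryId);
    }

    /**
     * Step 2: called before the workflow is reconciled.
     * A missing pod only means "deleted externally" if we know it existed before.
     */
    Action decide(String primaryId, boolean podExists) {
        if (!podExists && podCreated.contains(primaryId)) {
            return Action.DELETE_PRIMARY;
        }
        return Action.RECONCILE;
    }

    public static void main(String[] args) {
        CriticalPodGuard guard = new CriticalPodGuard();
        System.out.println(guard.decide("my-resource", false)); // RECONCILE (never created)
        guard.markPodCreated("my-resource");
        System.out.println(guard.decide("my-resource", true));  // RECONCILE (pod is running)
        System.out.println(guard.decide("my-resource", false)); // DELETE_PRIMARY (pod vanished)
    }
}
```

In a real reconciler, DELETE_PRIMARY would translate into deleting the primary through the client and returning without reconciling the workflow, as described above.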
Note that some teams store the state in the status (like a flag that the pod was created). However, this has a caveat: in some rare cases the flag might not be present in the next reconciliation because the cache is out of sync. Therefore they also maintain an in-memory cache of the status that always has the latest version. Please study how it is implemented in the external state DR; you can easily manage this state correctly with a ConfigMap.
I created an issue that will allow covering this with managed workflows: https://github.com/java-operator-sdk/java-operator-sdk/issues/1898
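To illustrate the caveat, here is a minimal plain-Java sketch of an in-memory overlay on top of a status flag. It is not how the SDK implements it; CreatedFlagOverlay and its methods are hypothetical names, and the uid is assumed to be any stable identifier of the primary resource.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical helper, not part of Java Operator SDK. */
final class CreatedFlagOverlay {

    // Primaries for which this operator process has already written the "created" flag.
    private final Set<String> createdByThisProcess = ConcurrentHashMap.newKeySet();

    /** Call right after the flag has been written to the status. */
    void recordCreated(String uid) {
        createdByThisProcess.add(uid);
    }

    /**
     * The flag read from the cached status may lag behind the write,
     * so the in-memory record is consulted as well.
     */
    boolean wasCreated(String uid, boolean flagInCachedStatus) {
        return flagInCachedStatus || createdByThisProcess.contains(uid);
    }

    /** Call when the primary is deleted. */
    void forget(String uid) {
        createdByThisProcess.remove(uid);
    }

    public static void main(String[] args) {
        CreatedFlagOverlay overlay = new CreatedFlagOverlay();
        overlay.recordCreated("uid-1");
        // The cached status is stale (flag still false), but the overlay remembers the write.
        System.out.println(overlay.wasCreated("uid-1", false)); // true
        System.out.println(overlay.wasCreated("uid-2", false)); // false
    }
}
```

The overlay only covers the window in which the cache is stale within one operator process. It is lost when the operator restarts, so durable state (the status flag itself, or a ConfigMap) is still needed.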
Thanks, @csviri, for your recommendations.
Actually, I do store some state related to some of the DRs, so I know when to delete the CR if something happens to a "critical" DR. This state is also reflected in the status at some point, but the state management doesn't rely on the CR status. Using something like a ConfigMap is a much better option for storing this state in my scenario, as stated in: https://javaoperatorsdk.io/docs/dependent-resources#external-state-tracking-dependent-resources
I was also looking into the KubernetesDependent annotation implementation and usages; I think this might be a good place to customize the DR reconcile lifecycle.
This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 14 days.
As far as I can see, the explicit invocation will cover this. Feel free to reopen if not.