crossplane-runtime

Resolving refs finds objects in terminating state

Open larhauga opened this issue 3 years ago • 5 comments

What happened?

When a resource references other objects, the reference is resolved to an object even when that object is in a terminating state.

How can we reproduce it?

We can reproduce this with the usage of AWS SecurityGroups where AWS checks if the SG ID is in use elsewhere.

apiVersion: ec2.aws.crossplane.io/v1beta1
kind: SecurityGroup
metadata:
  name: app-sg
  labels:
    type: application
    app: foobar
spec:
  forProvider:
    vpcId: XXX
    region: 
    description: Application security group
    groupName: app-sg
  providerConfigRef:
    name: default

apiVersion: ec2.aws.crossplane.io/v1beta1
kind: SecurityGroup
metadata:
  name: redis-sg
spec:
  forProvider:
    vpcId: XXX
    region: xxx
    description: redis-sg
    groupName: redis-sg
    ingress:
      - ipProtocol: tcp
        fromPort: 6379
        toPort: 6379
        userIdGroupPairs:
        - groupIdSelector:
            matchLabels:
              type: application
              app: foobar
  providerConfigRef:
    name: default

kubectl apply -f application-sg.yaml -f redis-sg.yaml
kubectl get securitygroups
app-sg                                     True    True     sg-xxx   vpc-xxx   85s
redis-sg                                   True    True     sg-xxx   vpc-xxx   65s

Delete the application security group that the redis security group references. The delete will hang because the AWS API blocks it.

kubectl delete -f application-sg.yaml # will hang on blocking

This results in the following. Because redis-sg still depends on app-sg, each reconcile resolves the reference to app-sg even though app-sg is no longer ready.

app-sg                                     False   False    sg-xxx   vpc-xxx   83s
redis-sg                                   True    True     sg-xxx   vpc-xxx   65s

The event on app-sg is the following:

failed to delete the SecurityGroup resource: DependencyViolation: resource sg-xxx has a dependent object

Expected behaviour

In #250 it is described that clearing the reference should make this work, but that does not seem to be the case.

kubectl edit -f redis-sg.yaml # remove sg- reference

I assume a condition is missing in references.go, since the state of the destination object is not checked.

What environment did it happen in?

Crossplane version: 1.1.0
Cloud provider: provider-aws 0.17 + PR https://github.com/crossplane/provider-aws/pull/614
Kubernetes version: 1.18
Kubernetes distribution: EKS

larhauga, Apr 15 '21 07:04

What is the scenario that requires you to reference a terminating object? A case I can think of is that you reference an object, it's resolved, but now you'd like to delete and re-create that referenced object and expect that it picks up the new value. If that's the case, https://github.com/crossplane/crossplane-runtime/pull/328 should help you set a policy for the reference to repeatedly resolve even if it's already resolved to a value. Would that help?

muvaf, Apr 21 '22 22:04

Yeah, this is about being able to recreate (or just delete) a composition or managed object that is referenced from another. For example, if you have an App and a DB, and the DB references the App in an inbound security group rule.

#328 will definitely help, but wouldn't it be 50-50 whether resolution selects the terminating object instead? So after 6 reconciles there's still roughly a 1.6% chance (0.5^6) that every reconcile selected the terminating object? And AWS will not allow that object to terminate because of the reference.
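As a rough check of that estimate, assuming (purely for illustration) that each reconcile picks between the terminating object and its replacement independently and with equal probability, the chance that six reconciles in a row all pick the terminating object is:

```latex
P(\text{all 6 reconciles pick the terminating object})
  = \left(\tfrac{1}{2}\right)^{6}
  = \tfrac{1}{64}
  \approx 1.6\%
```

The independence assumption is only a simplification. If the resolver lists candidates in a stable order, the same object could be picked every time, which makes skipping terminating objects more important, not less.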

And if the new object is "named" in AWS and not "random ID"-based, there will never be two objects available at the same time. In that case, the situation will never clear up.

So I think during resolution, we should still skip terminating objects, but the implementation will be much simpler thanks to #328.

chlunde, May 01 '22 08:05

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot], Aug 13 '22 07:08

/fresh

chlunde, Sep 13 '22 09:09

Crossplane does not currently have enough maintainers to address every issue and pull request. This issue has been automatically marked as stale because it has had no activity in the last 90 days. It will be closed in 14 days if no further activity occurs. Leaving a comment starting with /fresh will mark this issue as not stale.

github-actions[bot], Sep 04 '24 01:09