Port status not being updated after being attached to a server
Using these two resources:
```yaml
apiVersion: openstack.k-orc.cloud/v1alpha1
kind: Port
metadata:
  name: create-full
spec:
  cloudCredentialsRef:
    cloudName: openstack
    secretName: openstack-clouds
  managementPolicy: managed
  networkRef: create-full
  resource:
    addresses:
    - subnetRef: create-full
---
apiVersion: openstack.k-orc.cloud/v1alpha1
kind: Server
metadata:
  name: create-full
spec:
  cloudCredentialsRef:
    cloudName: openstack
    secretName: openstack-clouds
  managementPolicy: managed
  resource:
    name: create-full-override
    imageRef: create-full
    flavorRef: server-flavor
    ports:
    - portRef: create-full
```
The port status is not being updated after the server is ACTIVE:
```yaml
apiVersion: openstack.k-orc.cloud/v1alpha1
kind: Port
metadata:
  creationTimestamp: "2025-02-28T13:27:51Z"
  finalizers:
  - openstack.k-orc.cloud/port
  - openstack.k-orc.cloud/server
  generation: 1
  name: create-full
  namespace: kuttl-test-correct-oarfish
  resourceVersion: "136683"
  uid: 0b2d2e3d-ba2c-4d67-b3be-a3f60c9ff9b8
spec:
  cloudCredentialsRef:
    cloudName: openstack
    secretName: openstack-clouds
  managementPolicy: managed
  networkRef: create-full
  resource:
    addresses:
    - subnetRef: create-full
status:
  conditions:
  - lastTransitionTime: "2025-02-28T13:28:26Z"
    message: OpenStack resource is available
    observedGeneration: 1
    reason: Success
    status: "True"
    type: Available
  - lastTransitionTime: "2025-02-28T13:28:26Z"
    message: OpenStack resource is up to date
    observedGeneration: 1
    reason: Success
    status: "False"
    type: Progressing
  id: e2b8af31-7243-4845-a397-d52e1f1b74e4
  resource:
    adminStateUp: true
    createdAt: "2025-02-28T13:28:20Z"
    description: ""
    deviceID: ""
    deviceOwner: ""
    fixedIPs:
    - ip: 192.168.200.253
      subnetID: 9b48c072-4d83-4eab-be6e-bb0745aeed16
    macAddress: fa:16:3e:bd:db:84
    name: create-full
    projectID: 63df4263dea34861ab8989f03f47c5c3
    propagateUplinkStatus: false
    revisionNumber: 1
    status: DOWN
    updatedAt: "2025-02-28T13:28:21Z"
```
We should see the deviceID and deviceOwner fields being set, and status changing to UP.
I think this is second-tier priority for now, especially as we're not yet really attempting to keep resource status up to date over time. That said, I think it's slightly more than 'regular' management of resource status over time, as this is status changing in direct response to a provisioning action. So maybe tier 1.5?
I have some thoughts as to how we could address it which I'll add later.
The end goal here is to have deviceID reported in the port's resource status. Resource status is owned by the port controller, so we need some way to indicate to the port controller that it needs to update its status. An obvious way to achieve this is to update its spec.
I propose we add a new spec (that's spec, NOT resource spec!) field to Port: attachment. At a minimum this would be an object reference to the object which has attached itself to this port. Because this would be an update to the spec, it would bump the generation, which would indicate to the port controller that there is something new to reconcile. The field would be written by whichever controller is attaching to the port. If we only write this change after the attachment has occurred, the port controller should expect to observe it immediately, and therefore not have to poll.
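A minimal sketch of what the proposed field could look like on the Go API type. The names here (`Attachment`, `AttachmentRef`) are purely illustrative assumptions, not ORC's actual API:

```go
package v1alpha1

// PortSpec sketch: only the proposed addition is shown; the existing fields
// (cloudCredentialsRef, managementPolicy, networkRef, resource) are elided.
type PortSpec struct {
	// ... existing fields elided ...

	// Attachment records the object that has attached itself to this port.
	// It is written by the attaching controller (e.g. the server controller)
	// only after the attachment has been made in OpenStack. Because it is a
	// spec field, writing it bumps the Port's generation, which tells the
	// port controller there is something new to reconcile.
	// +optional
	Attachment *AttachmentRef `json:"attachment,omitempty"`
}

// AttachmentRef identifies the object attached to this port.
type AttachmentRef struct {
	// Kind of the attaching object, e.g. "Server".
	Kind string `json:"kind"`
	// Name of the attaching object, in the same namespace as the Port.
	Name string `json:"name"`
}
```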
This also allows us to address a problem we hadn't yet noticed: that ORC currently allows multiple objects to reference the same port. With this change they can still reference the same port, but we have an opportunity to set a useful status about it. This would likely be a terminal error, but we would need to ensure that shouldReconcile handles this correctly if the target port is updated to no longer have an attachment.
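Continuing the hypothetical types from the sketch above, here is one way an attaching controller could claim the port before attaching, with a conflicting claim surfacing as an error. In practice the conflict would become a terminal error condition on the attaching object, and shouldReconcile would need to re-evaluate it if the port's attachment is later cleared. The `Port` stand-in and `claimPort` helper are illustrative only:

```go
package v1alpha1

import "fmt"

// Port sketch: a stand-in for the real ORC Port type, reduced to what this
// example needs.
type Port struct {
	Name string
	Spec PortSpec
}

// claimPort records owner as the port's attachment, or reports a conflict if
// another object has already claimed it. The caller would persist the change
// with a spec patch and translate a conflict into a terminal error condition.
func claimPort(port *Port, owner AttachmentRef) error {
	if port.Spec.Attachment != nil {
		if *port.Spec.Attachment == owner {
			// Already claimed by us: nothing to do.
			return nil
		}
		return fmt.Errorf("port %q is already attached to %s %q",
			port.Name, port.Spec.Attachment.Kind, port.Spec.Attachment.Name)
	}
	port.Spec.Attachment = &owner
	return nil
}
```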
Volumes have the same problem. I'm taking it.
While working on this issue for volumes (#528), I realized the solution might be a little more complex than highlighted here:
- we get timing issues that are tricky to handle:
  - for example, you can't attach volumes to a server while it's being built
  - after you've issued the API call to attach a volume to a server, the attachment is not reflected immediately in OpenStack
  - you also need to refresh the "attachments" list of an attached resource when the server is gone, and again this is not reflected immediately after the delete API call is issued
- how do you handle the situation where the OpenStack attachment call succeeded but the k8s call to update the object didn't? I've chosen to ignore failures when updating the objects.
I think this solution, while it kind of works, is a bit fragile, and we would have to think about better alternatives for the longer term, where the attachable resource registers for events, similar to what we do for dependencies.
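A minimal sketch of how a reconcile step might cope with the timing issues listed above, assuming a controller-runtime style requeue. The `openstackAttachments` interface and its methods are hypothetical stand-ins for the real Nova/Cinder calls, not ORC's actual code:

```go
package attach

import (
	"context"
	"time"

	ctrl "sigs.k8s.io/controller-runtime"
)

// openstackAttachments is a hypothetical abstraction over the OpenStack calls
// the real controller would make; it exists only to keep this sketch
// self-contained.
type openstackAttachments interface {
	ServerIsActive(ctx context.Context, serverID string) (bool, error)
	AttachmentVisible(ctx context.Context, serverID, volumeID string) (bool, error)
	AttachIssued(serverID, volumeID string) bool
	AttachVolume(ctx context.Context, serverID, volumeID string) error
}

// ensureAttached waits until the server can accept attachments, issues the
// attach call at most once, and requeues until OpenStack actually reports the
// attachment before the controller refreshes resource status.
func ensureAttached(ctx context.Context, cloud openstackAttachments, serverID, volumeID string) (ctrl.Result, error) {
	// Volumes cannot be attached while the server is still building.
	active, err := cloud.ServerIsActive(ctx, serverID)
	if err != nil {
		return ctrl.Result{}, err
	}
	if !active {
		return ctrl.Result{RequeueAfter: 10 * time.Second}, nil
	}

	// The attach API call is not reflected immediately, so "attach issued"
	// and "attachment visible" are separate states.
	visible, err := cloud.AttachmentVisible(ctx, serverID, volumeID)
	if err != nil {
		return ctrl.Result{}, err
	}
	if visible {
		// Only now is it safe to update the attached resource's status.
		return ctrl.Result{}, nil
	}
	if !cloud.AttachIssued(serverID, volumeID) {
		if err := cloud.AttachVolume(ctx, serverID, volumeID); err != nil {
			return ctrl.Result{}, err
		}
	}
	return ctrl.Result{RequeueAfter: 10 * time.Second}, nil
}
```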