Bug: reconciliation of serversazureadonlyauthentications resources stuck in failed state
We are using a Helm chart to create an Azure SQL server. We want the server to be accessible only via AAD login. To achieve this we create three resources:
- Server
- ServersAdministrator
- ServersAzureADOnlyAuthentication
All three resources are applied simultaneously, and we let the reconciliation loop handle the dependencies between them. The Server is created as intended, and once it is ready the ServersAdministrator resource is created and an AAD admin is added to the server successfully. As the last step we want to update the server to allow only AAD login, but the ServersAzureADOnlyAuthentication resource is stuck in a failed state because it was applied once the Server was ready but before the ServersAdministrator was ready.
error displayed:
invalidServerAADOnlyAuthNoAADAdminPropertyName AAD Admin is not configured, AAD Admin must be set before enabling/disabling AAD Only Authentication.
Expected behavior
We would expect the ServersAzureADOnlyAuthentication resource to keep retrying and to reconcile successfully once the ServersAdministrator resource has been applied, but this does not happen.
Now we have to manually delete the ServersAzureADOnlyAuthentication resource and then re-apply the ServersAzureADOnlyAuthentication manifest for it to work as intended.
Versions
AKS, running k8s version: 1.33.2
ASO version: mcr.microsoft.com/k8s/azureserviceoperator:v2.15.0
To Reproduce
apiVersion: sql.azure.com/v1api20211101
kind: Server
metadata:
  name: test-app-dev-sql-server
  namespace: test-namespace
spec:
  owner:
    name: test-app-dev-rg
  location: westeurope
  administratorLogin: tmp-admin
  administratorLoginPassword:
    key: sql-admin-password
    name: tmp-password-secret
---
apiVersion: sql.azure.com/v1api20211101
kind: ServersAdministrator
metadata:
  name: test-app-dev-sql-server-aad-admin
  namespace: test-namespace
spec:
  owner:
    name: test-app-dev-sql-server
  administratorType: ActiveDirectory
  login: sqlaadadmin
  sid: 12345678-1234-1234-1234-123456789abc
  tenantId: 12345678-1234-1234-1234-123456789abc
---
apiVersion: sql.azure.com/v1api20211101
kind: ServersAzureADOnlyAuthentication
metadata:
  name: test-app-dev-aad-only-auth
  namespace: test-namespace
spec:
  owner:
    name: test-app-dev-sql-server
  azureADOnlyAuthentication: true
Additional context
Am I correct in assuming that if the Server object exposed a status field such as 'aadAdminID' or 'aadAdminConfigured', the ServersAzureADOnlyAuthentication could watch that value and trigger a new reconciliation whenever it changed?
I suspect (but would need to verify) that the administrators property of the Server does exactly that - allowing us to see whether the ServersAdministrator has taken effect or not.
Can you share the full ASO logs for this error?
I suspect that we're treating this error as non-retryable when we should be retrying on it. This is also somewhat related to this comment.
If we have the full logs we can put a patch in for this.
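For illustration only, the kind of patch I have in mind would classify this error code as retryable rather than fatal. The Go sketch below is self-contained and does not use ASO's actual classifier types: `Classification`, `retryableCodes`, and `Classify` are hypothetical names, and the code string is copied from the error text in the report (compared case-insensitively, since I haven't confirmed the exact casing Azure returns).

```go
package main

import (
	"fmt"
	"strings"
)

// Classification is a hypothetical label for how a controller should
// treat an error returned by Azure.
type Classification string

const (
	Retryable Classification = "retryable"
	Fatal     Classification = "fatal"
)

// retryableCodes lists (lowercased) error codes that signal a dependency
// which may resolve on its own later. The single entry is taken from the
// error message in this issue.
var retryableCodes = map[string]bool{
	"invalidserveraadonlyauthnoaadadminpropertyname": true,
}

// Classify returns Retryable for known dependency errors and Fatal otherwise.
// A real classifier would have more cases; this default is an assumption
// made to keep the sketch short.
func Classify(code string) Classification {
	if retryableCodes[strings.ToLower(code)] {
		return Retryable
	}
	return Fatal
}

func main() {
	fmt.Println(Classify("invalidServerAADOnlyAuthNoAADAdminPropertyName"))
	fmt.Println(Classify("SomeOtherError"))
}
```

With a classification like this the ServersAzureADOnlyAuthentication resource would be requeued until the ServersAdministrator has taken effect, but the full logs are still needed to confirm which code path is actually marking the error as terminal.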