Bug: reconciliation of serversazureadonlyauthentications resources stuck in failed state

Open andreasthuen opened this issue 3 months ago • 3 comments

We are using a Helm chart to create an Azure SQL server. We want the server to be accessible only via AAD login. To achieve this we create three resources:

  • Server
  • ServersAdministrator
  • ServersAzureADOnlyAuthentication

All three resources are created simultaneously, and we rely on the reconciliation loop to handle the dependencies between them. The Server is created as intended; once it is ready, the ServersAdministrator resource is created and an AAD admin is added to the server successfully. The last step is to update the server to allow only AAD login, but the ServersAzureADOnlyAuthentication resource is stuck in a failed state because it was applied when the Server was ready but the ServersAdministrator was not.

Error displayed: invalidServerAADOnlyAuthNoAADAdminPropertyName AAD Admin is not configured, AAD Admin must be set before enabling/disabling AAD Only Authentication.

Expected behavior

We would expect the ServersAzureADOnlyAuthentication resource to keep retrying and to reconcile successfully once the ServersAdministrator resource has been applied, but this does not happen.
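The expected behavior can be sketched as a requeue loop. The Python below is a language-neutral illustration, not ASO code: `DependencyNotReady`, `reconcile_until_ready`, and the simulated `apply_aad_only_auth` are hypothetical names, and the loop retries immediately instead of waiting so that it runs instantly.

```python
class DependencyNotReady(Exception):
    """Hypothetical error: the request was rejected because a prerequisite is missing."""


def reconcile_until_ready(apply, max_attempts=5):
    """Call apply() until it succeeds, requeueing on DependencyNotReady.

    Returns the number of attempts used. A real controller would wait
    (with backoff) between attempts rather than loop immediately.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            apply()
            return attempt
        except DependencyNotReady:
            continue  # requeue: the prerequisite may be ready next time
    raise TimeoutError(f"still not ready after {max_attempts} attempts")


def make_fake_apply(admin_ready_after):
    """Simulate the server: the AAD admin becomes visible after N failed calls."""
    calls = {"n": 0}

    def apply_aad_only_auth():
        calls["n"] += 1
        if calls["n"] <= admin_ready_after:
            raise DependencyNotReady("AAD Admin is not configured")

    return apply_aad_only_auth


# The first two attempts fail because the admin is not yet configured;
# the third succeeds without anyone deleting and re-applying the resource.
attempts = reconcile_until_ready(make_fake_apply(admin_ready_after=2))
print(attempts)
```

The point of the sketch is that the "AAD Admin is not configured" condition is transient in this scenario, so it is treated as a reason to try again rather than as a terminal failure.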

As a workaround, we have to manually delete the ServersAzureADOnlyAuthentication resource and then re-apply its manifest for it to work as intended.

Versions

AKS, running k8s version: 1.33.2

ASO version: mcr.microsoft.com/k8s/azureserviceoperator:v2.15.0

To Reproduce


apiVersion: sql.azure.com/v1api20211101
kind: Server
metadata:
  name: test-app-dev-sql-server
  namespace: test-namespace
spec:
  owner:
    name: test-app-dev-rg
  location: westeurope
  administratorLogin: tmp-admin
  administratorLoginPassword:
    key: sql-admin-password
    name: tmp-password-secret   
---
apiVersion: sql.azure.com/v1api20211101
kind: ServersAdministrator
metadata:
  name: test-app-dev-sql-server-aad-admin
  namespace: test-namespace
spec:
  owner:
    name: test-app-dev-sql-server
  administratorType: ActiveDirectory
  login: sqlaadadmin
  sid: 12345678-1234-1234-1234-123456789abc
  tenantId: 12345678-1234-1234-1234-123456789abc
---
apiVersion: sql.azure.com/v1api20211101
kind: ServersAzureADOnlyAuthentication
metadata:
  name: test-app-dev-aad-only-auth
  namespace: test-namespace
spec:
  owner:
    name: test-app-dev-sql-server
  azureADOnlyAuthentication: true

andreasthuen avatar Sep 25 '25 08:09 andreasthuen

Am I correct in assuming that if the Server object had exposed a status field 'aadAdminID' or 'aadAdminConfigured' the ServersAzureADOnlyAuthentication could check for changes to the value of this and trigger a new reconciliation whenever it was updated?

andreasthuen avatar Sep 26 '25 05:09 andreasthuen

I suspect (but would need to verify) that the administrators property of the Server does exactly that - allowing us to see whether the ServersAdministrator has taken effect or not.

theunrepentantgeek avatar Sep 28 '25 20:09 theunrepentantgeek

Can you share the full ASO logs for this error?

I suspect that we're not retrying on the error here and we should be. This is also somewhat related to this comment.

If we have the full logs we can put a patch in for this.
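As a rough illustration of the idea only (the real fix would live in ASO's Go code and may look quite different), retrying on this error amounts to classifying its error code as retryable rather than fatal. In the Python sketch below, `RETRYABLE_CODES` and `classify` are hypothetical names and not ASO's API; the only detail taken from this issue is the error code itself.

```python
# Hypothetical set of error codes that signal a missing prerequisite
# which may appear later. Stored lowercase for case-insensitive lookup.
RETRYABLE_CODES = {
    "invalidserveraadonlyauthnoaadadminpropertyname",
}


def classify(error_code):
    """Return "retryable" for known transient codes, "fatal" otherwise."""
    if error_code.lower() in RETRYABLE_CODES:
        return "retryable"
    return "fatal"


print(classify("invalidServerAADOnlyAuthNoAADAdminPropertyName"))
print(classify("SomeOtherError"))
```

Matching case-insensitively is a defensive choice in this sketch, since the casing of the code as reported in the issue may differ from what the API returns.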

matthchr avatar Sep 29 '25 22:09 matthchr