[Feature] Integrate FluxCD extension in AKS with Azure DevOps without using PATs

Open hadberg opened this issue 1 year ago • 31 comments

Is your feature request related to a problem? Please describe. When configuring the Flux integration with Azure DevOps, the only out-of-the-box option appears to be personal access tokens.

Describe the solution you'd like We would like to be able to integrate the flux extension with Azure DevOps through a service principal or workload identity. Basically using any authentication mechanism that is not tied to a personal user account.

Describe alternatives you've considered

  1. Using a PAT which we really would like to avoid
  2. A custom service principal solution with a container continuously requesting a new access token for the service principal and updating the flux pods with the returned token.

hadberg avatar Apr 02 '24 11:04 hadberg

Yes, please add this! For companies that enforce a short PAT lifetime, it is very cumbersome to re-authenticate each GitOps configuration with a new PAT every x number of days.

Carsondraper avatar Apr 23 '24 23:04 Carsondraper

We would like it very much if workload identities were implemented! See: https://fluxcd.io/flux/installation/configuration/workload-identity/#azure-workload-identity.

Right now, our work-around is to use an Azure Blob, and configuring the service principal such that the pipeline has access to it. Then, for the flux configuration we have configured a bucket source. This is not ideal, since an extra step is required to keep the repository and Azure Blob in sync.

bramvanneerven avatar Sep 09 '24 10:09 bramvanneerven

Flux 2.4.0 is now available with workload identity functionality built in for the source-controller: https://github.com/fluxcd/flux2/releases

https://github.com/fluxcd/source-controller/blob/v1.4.1/CHANGELOG.md

Would be great if this could be implemented the same way as the image and kustomize controllers.

jayctran avatar Oct 10 '24 14:10 jayctran

Since the release of Flux source controller v1.4.0 in Flux bundle v2.4.0, this is now possible.

However, it requires providing spec.provider: 'azure' in the GitRepository object. This would mean that the ARM API Microsoft.KubernetesConfiguration/fluxConfigurations@2023-05-01 would also need to be updated to accommodate the new property.

When can we expect this to be available in the AKS flux extension?

miqm avatar Oct 10 '24 14:10 miqm

That was fast: https://learn.microsoft.com/en-us/azure/azure-arc/kubernetes/extensions-release#1130-october-2024 ❤

miqm avatar Oct 15 '24 09:10 miqm

@miqm We still need properties.gitRepository.managedIdentity (similar to azureBlob) on Microsoft.KubernetesConfiguration/fluxConfigurations in order to configure workload identities and DevOps, since we cannot control the Flux bootstrap, right? Or have you found a work around for now?

bramvanneerven avatar Oct 15 '24 10:10 bramvanneerven

I hope that workload identity configuration will do the trick as described here: https://learn.microsoft.com/en-us/azure/azure-arc/kubernetes/tutorial-use-gitops-flux2?tabs=azure-cli#workload-identity-in-aks-clusters

The proper annotation is set on the pod (screenshot omitted).

But still, we'd need to set provider: 'azure'. At least the first part is done: the version bump.

miqm avatar Oct 15 '24 11:10 miqm

Wow that's fast!

Sounds like we need a workaround, potentially patching the GitRepository for this to work until they come up with a native implementation.

jayctran avatar Oct 15 '24 13:10 jayctran

I see that v1.13.0 is available in uksouth and westcentralus - anyone with a cluster there? :)

miqm avatar Oct 16 '24 13:10 miqm

@bavneetsingh16 - could you take a look and provide details if we can set provider: 'azure' on GitRepository object using flux extension?

miqm avatar Oct 17 '24 13:10 miqm

We are working on releasing a new API version with the ability to specify the provider field via ARM.

dipti-pai avatar Oct 17 '24 21:10 dipti-pai

I manually patched GitRepository object by adding provider: azure and it works beautifully. However, doing a redeployment (I'm using bicep, so every CD run is a redeployment) clears the value. So we need to wait for the RP API update.
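A minimal sketch of such a manual patch, for illustration only and not necessarily the exact command used here. The helper name `patch_gitrepo_provider` is hypothetical, and the GitRepository name and namespace are placeholders you would supply. As noted above, an ARM redeployment resets the field, so this is a stopgap at best.

```shell
# Merge-patch body that sets spec.provider on a Flux GitRepository object.
PATCH_BODY='{"spec":{"provider":"azure"}}'

# Hypothetical helper: patch the named GitRepository in the given namespace.
# Usage: patch_gitrepo_provider <name> <namespace>
patch_gitrepo_provider() {
  name="$1"
  namespace="$2"
  kubectl patch gitrepositories.source.toolkit.fluxcd.io "$name" \
    --namespace "$namespace" \
    --type merge \
    --patch "$PATCH_BODY"
}
```

For example, `patch_gitrepo_provider app flux-system` would patch a GitRepository called `app` in the `flux-system` namespace (both names are assumptions).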

miqm avatar Oct 25 '24 10:10 miqm

@dipti-pai Any update on this?

miqm avatar Nov 06 '24 10:11 miqm

@miqm What should have been a relatively small change has got caught in a redesign of the swagger/ARM API layer leading to delays. The new ARM API version release might not be available before January, will keep you posted if something changes and we are able to release it sooner.

dipti-pai avatar Nov 12 '24 20:11 dipti-pai

I've used this by setting the specs by hand and it works great, but it would be awesome to also be able to onboard via bicep, as I think you need to have it working via bicep/UI for it to appear in the portal.

btrepp avatar Dec 03 '24 12:12 btrepp

@dipti-pai Happy New Year! Is the change-freeze over? Any update on the new ARM API for flux?

miqm avatar Jan 07 '25 14:01 miqm

@miqm Happy New Year! Thanks for following up. Deployments are resuming next week; however, it's happening in a phased manner, prioritizing critical and security-related issues first. Features will be released only after stabilizing the first set of releases; the tentative ETA is end of February.

Apologies for any inconvenience/delays caused due to this.

dipti-pai avatar Jan 08 '25 23:01 dipti-pai

@dipti-pai any update?

miqm avatar Mar 04 '25 15:03 miqm

@miqm, Sharing an update below:

The Azure service components are all rolled out. The ARM manifest that makes the 2024-11-01 API available in all prod regions is being rolled out right now. You can start using the provider using ARM templates once API version 2024-11-01 is available in your region with agent version >= 1.14.1 (released in Jan 2025) in your clusters.

Note that "az k8s-configuration" CLI command support is still pending and will be added once the 2024-11-01 version is available in all regions.

dipti-pai avatar Mar 07 '25 18:03 dipti-pai

@dipti-pai Awesome job! Can't wait to implement it.

One more question: did you by any chance also allow "github" as a provider value, so that a GitHub App can be used when connecting to a GitHub-based repo as described here: https://fluxcd.io/flux/components/source/gitrepositories/#github ?

miqm avatar Mar 09 '25 20:03 miqm

Is there a way of seeing the status? az provider show --namespace Microsoft.KubernetesConfiguration --query "resourceTypes[?resourceType=='fluxConfigurations']"

Seems to still show 2023-05-01/2024-04-01-preview as the latest version for me.
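The check above can be narrowed to a yes/no answer. This is a rough sketch: the helper names `list_flux_api_versions` and `has_api_version` are hypothetical. It also assumes that the API versions the resource provider advertises to the subscription reflect what is usable in your region, which may not hold mid-rollout.

```shell
# List the API versions advertised for fluxConfigurations, one per line.
list_flux_api_versions() {
  az provider show \
    --namespace Microsoft.KubernetesConfiguration \
    --query "resourceTypes[?resourceType=='fluxConfigurations'].apiVersions[]" \
    --output tsv
}

# Hypothetical helper: succeed if the wanted version appears on stdin
# as an exact, whole-line match.
has_api_version() {
  grep -qxF -- "$1"
}
```

Usage: `list_flux_api_versions | has_api_version 2024-11-01 && echo "2024-11-01 is listed"`.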

bt701 avatar Mar 10 '25 04:03 bt701

I have this API available in my region, the query @bt701 posted also lists the 2024-11-01 version. Just deployed using new API and accessing ADO works, although in one case the provider param wasn't set despite having it in bicep resource. I'll provide more details when I dig a bit more to find what was different between deployments.

miqm avatar Mar 19 '25 15:03 miqm

@dipti-pai I noticed that sometimes the provider: 'azure' property is not being applied to the cluster's FluxConfig and then to GitRepository objects. Re-running the deployment helps. Can you check?

miqm avatar Mar 20 '25 07:03 miqm

Hi @miqm, could you share an example deployment where this did not work as expected? Thanks.

dipti-pai avatar Mar 20 '25 18:03 dipti-pai

CorrelationIds: 84e699de-abb3-4a83-a8c0-fa307699c2fc, 341ec71e-4f66-49f7-8cd0-9a4860047767, 876336eb-6f7d-429d-8f92-e3128000cb6c, 55ecc1d6-4d10-48cc-8b4f-517f83561a67. The deployment had 3 fluxConfigurations; the 2 ending with -services worked after several re-runs, but the app one didn't. Finally, when I made a PATCH request from the CLI using az rest -m PATCH -u '/subscriptions/[sub]/resourcegroups/[rg]/providers/Microsoft.ContainerService/managedClusters/[aks]/providers/Microsoft.KubernetesConfiguration/fluxConfigurations/app?api-version=2024-11-01' --body '{"properties":{"gitRepository":{"provider":"azure"}}}', the value was set.

CorrelationId: c944f4c4-fcfb-434c-80dd-93fd7f6d1ce6. I was changing from PAT to provider azure; in both configurations the provider was not set. Re-running the deployment with no changes in the IaC (using Bicep), just a re-run triggered on Azure Pipelines (correlationId: 39dbf36e-18c7-4fdc-9143-e4038b1c8da2), set the provider value.

By "it didn't work" I mean that when I examined the resources in the cluster, the GitRepository didn't have the provider field and the fluxconfig had provider: null.

Hope this helps with troubleshooting. The bicep fragment I use:

resource fluxConfigurationClusterApplications 'Microsoft.KubernetesConfiguration/fluxConfigurations@2024-11-01' = {
  name: 'app'
  scope: aks
  properties: {
    scope: 'cluster'
    namespace: SYSTEM_NAMESPACE
    sourceKind: 'GitRepository'
    gitRepository: {
      url: 'https://dev.azure.com/[org]/[project]/_git/[repo]'
      timeoutInSeconds: 600
      syncIntervalInSeconds: 300
      repositoryRef: {
        branch: fluxOptions.clusterBranch
      }
      provider: 'Azure'
    }
    // remaining properties omitted in the original fragment
  }
}

miqm avatar Mar 20 '25 18:03 miqm

@miqm The bicep release with provider support is still in progress. I will update here when it's usable. Could you confirm whether you see any issues when you work directly with ARM using templates/REST API?

Note that "az k8s-configuration" support/Terraform support is also WIP.

dipti-pai avatar Mar 20 '25 20:03 dipti-pai

As far as I know (and I know bicep very well), bicep compiles to ARM and the ARM JSON is used for making the deployment, so there should be no reason why ARM would work and bicep wouldn't. Unless the ARM backend is using some API endpoints other than those that are publicly available.

miqm avatar Mar 20 '25 21:03 miqm

Looks like bicep build does proceed with the deployment even though the types are not released yet, with the warning below:

Warning BCP081: Resource type "Microsoft.KubernetesConfiguration/fluxConfigurations@2024-11-01" does not have types available. Bicep is unable to validate resource properties prior to deployment, but this will not block the resource from being deployed. [https://aka.ms/bicep/core-diagnostics#BCP081]

I tried a bicep template with provider set in westus2 region and did not see any issues in provider being set. I will look into the correlation IDs and update if I find something.

dipti-pai avatar Mar 20 '25 23:03 dipti-pai

@miqm I looked into the correlation IDs and the only thing I can say for sure is that for the first request, the request the agent processed on the cluster did not have provider set to azure, in the second request, it did. The request body in the RP/dataplane layers is not logged, hence it is hard to say which of the internal layers dropped the provider field in the first request.

Could you help me understand whether this was a one-off, or are you noticing this behavior frequently?

dipti-pai avatar Mar 21 '25 19:03 dipti-pai

Hi,

Since those requests I haven't made any deployments to the cluster where I have the flux + ADO setup, so I can't say yes or no. Those requests were based on the same template, so it's very odd that it worked one time and not another. I'll try to trigger a re-run next week to see whether the provider field stays set or changes to null, and I'll let you know the result.

miqm avatar Mar 21 '25 20:03 miqm