[Feature] Integrate FluxCD extension in AKS with Azure DevOps without using PATs
Is your feature request related to a problem? Please describe.
When configuring the Flux integration with Azure DevOps, the only out-of-the-box option appears to be personal access tokens (PATs).
Describe the solution you'd like
We would like to be able to integrate the Flux extension with Azure DevOps through a service principal or workload identity, that is, any authentication mechanism that is not tied to a personal user account.
Describe alternatives you've considered
- Using a PAT, which we would really like to avoid
- A custom service principal solution with a container continuously requesting a new access token for the service principal and updating the flux pods with the returned token.
Yes, please add this! For companies that enforce a short PAT lifetime, it is very cumbersome to re-authenticate each GitOps configuration with a new PAT every x number of days.
We would like it very much if workload identities were implemented! See: https://fluxcd.io/flux/installation/configuration/workload-identity/#azure-workload-identity.
Right now, our workaround is to use an Azure Blob and to configure the service principal so that the pipeline has access to it. For the Flux configuration, we have then configured a bucket source. This is not ideal, since an extra step is required to keep the repository and the Azure Blob in sync.
Flux 2.4.0 is now available with workload identity functionality built in for the source-controller: https://github.com/fluxcd/flux2/releases
https://github.com/fluxcd/source-controller/blob/v1.4.1/CHANGELOG.md
Would be great if this could be implemented the same way as the image and kustomize controllers.
Since the release of Flux source-controller v1.4.0 in the Flux bundle v2.4.0, this is now possible.
However, it requires setting spec.provider: 'azure' in the GitRepository object. This means that the ARM API Microsoft.KubernetesConfiguration/fluxConfigurations@2023-05-01 would also need to be updated to accommodate the new property.
When can we expect this to be available in the AKS Flux extension?
That was fast: https://learn.microsoft.com/en-us/azure/azure-arc/kubernetes/extensions-release#1130-october-2024 ❤
@miqm We still need properties.gitRepository.managedIdentity (similar to azureBlob) on Microsoft.KubernetesConfiguration/fluxConfigurations in order to configure workload identities and DevOps, since we cannot control the Flux bootstrap, right? Or have you found a workaround for now?
I hope that workload identity configuration will do the trick as described here: https://learn.microsoft.com/en-us/azure/azure-arc/kubernetes/tutorial-use-gitops-flux2?tabs=azure-cli#workload-identity-in-aks-clusters
The proper annotation is set on the pod, but we would still need to set provider: 'azure'. At least the first part, bumping the version, is done.
Wow that's fast!
Sounds like we need a workaround, potentially patching the GitRepository for this to work until they come up with a native implementation.
I see that v1.13.0 is available in uksouth and westcentralus - anyone with a cluster there? :)
@bavneetsingh16 - could you take a look and provide details if we can set provider: 'azure' on GitRepository object using flux extension?
We are working on releasing a new API version with the ability to specify the provider field via ARM.
I manually patched the GitRepository object by adding provider: azure and it works beautifully. However, doing a redeployment (I'm using Bicep, so every CD run is a redeployment) clears the value. So we need to wait for the RP API update.
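For anyone who wants to try the same interim workaround, here is a minimal sketch of such a manual patch. It assumes the GitRepository object is named app and lives in the flux-system namespace; both names are hypothetical, so substitute the ones from your own configuration. It uses a JSON merge patch because GitRepository is a custom resource. As described above, the next redeployment through the resource provider will clear the value again.

```shell
# Build the JSON merge patch that sets spec.provider on a GitRepository.
provider_patch() {
  printf '{"spec":{"provider":"%s"}}' "$1"
}

# Apply the patch with kubectl.
# $1 = namespace, $2 = GitRepository name (both are placeholders you supply).
patch_git_repository() {
  kubectl patch gitrepository "$2" --namespace "$1" \
    --type merge --patch "$(provider_patch azure)"
}

# Example call (hypothetical names):
# patch_git_repository flux-system app
```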
@dipti-pai Any update on this?
@miqm What should have been a relatively small change got caught up in a redesign of the swagger/ARM API layer, leading to delays. The new ARM API version might not be available before January. I will keep you posted if something changes and we are able to release it sooner.
I've used this by setting the spec by hand and it works great, but it would be awesome to also be able to onboard via Bicep, as I think it needs to be configured via Bicep/UI for it to appear in the portal.
@dipti-pai Happy New Year! Is the change-freeze over? Any update on the new ARM API for flux?
@miqm Happy New Year! Thanks for following up. Deployments are resuming next week; however, this is happening in a phased manner, prioritizing critical and security-related issues first. Features will be released only after the first set of releases has stabilized; the tentative ETA is end of February.
Apologies for any inconvenience/delays caused due to this.
@dipti-pai any update?
@miqm, Sharing an update below:
The Azure service components are all rolled out. The ARM manifest that makes the 2024-11-01 API available in all prod regions is being rolled out right now. You can start using the provider via ARM templates once API version 2024-11-01 is available in your region, with agent version >= 1.14.1 (released in Jan 2025) in your clusters.
Note that "az k8s-configuration" CLI command support is still pending and will be added once the 2024-11-01 version is available in all regions.
@dipti-pai Awesome job! Can't wait to implement it.
One more question: did you by any chance also allow "github" as a provider value, so that a GitHub App can be used when connecting to a GitHub-based repo, as described here: https://fluxcd.io/flux/components/source/gitrepositories/#github ?
Is there a way of seeing the status? az provider show --namespace Microsoft.KubernetesConfiguration --query "resourceTypes[?resourceType=='fluxConfigurations']"
Seems to still show 2023-05-01/2024-04-01-preview as the latest version for me.
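A slightly narrower variant of that query, as a sketch: it lists only the API versions for fluxConfigurations and checks for a specific one. The helper names are made up for illustration, and the assumption here is that the provider registration output is a good enough signal that the version can be used from your subscription.

```shell
# List the API versions registered for fluxConfigurations, one per line.
list_flux_api_versions() {
  az provider show \
    --namespace Microsoft.KubernetesConfiguration \
    --query "resourceTypes[?resourceType=='fluxConfigurations'].apiVersions[]" \
    --output tsv
}

# Succeed if the version given as $1 appears on stdin as a whole line.
has_api_version() {
  grep -qxF -- "$1"
}

# Example:
# list_flux_api_versions | has_api_version 2024-11-01 && echo "2024-11-01 is listed"
```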
I have this API available in my region; the query @bt701 posted also lists the 2024-11-01 version. I just deployed using the new API and accessing ADO works, although in one case the provider param wasn't set despite being present in the Bicep resource. I'll provide more details once I've dug a bit more to find what was different between the deployments.
@dipti-pai I noticed that sometimes the provider: 'azure' property is not being applied to the cluster's FluxConfig and then to GitRepository objects. Re-running the deployment helps. Can you check?
Hi @miqm, could you share an example deployment where this did not work as expected? Thanks.
CorrelationIds: 84e699de-abb3-4a83-a8c0-fa307699c2fc, 341ec71e-4f66-49f7-8cd0-9a4860047767, 876336eb-6f7d-429d-8f92-e3128000cb6c, 55ecc1d6-4d10-48cc-8b4f-517f83561a67. The deployment had 3 fluxConfigurations: the 2 ending with -services worked after several re-runs, but the app one didn't. Finally, when I made a PATCH request from the CLI using az rest -m PATCH -u '/subscriptions/[sub]/resourcegroups/[rg]/providers/Microsoft.ContainerService/managedClusters/[aks]/providers/Microsoft.KubernetesConfiguration/fluxConfigurations/app?api-version=2024-11-01' --body '{"properties":{"gitRepository":{"provider":"azure"}}}', the value was set.
CorrelationId: c944f4c4-fcfb-434c-80dd-93fd7f6d1ce6. I was changing from PAT to provider azure; in both configurations the provider was not set. Re-running the deployment with no changes in IaC (using Bicep), just a re-run triggered on Azure Pipelines (correlationId: 39dbf36e-18c7-4fdc-9143-e4038b1c8da2), set the provider value.
By "it didn't work" I mean that when I examined the resources in the cluster, the GitRepository didn't have the provider field and the fluxconfig had provider: null.
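For reference, a check along these lines can be sketched as below. The namespace and object name are placeholders, and the assumption is that the GitRepository carries the same name as the flux configuration; adjust if your cluster differs.

```shell
# Print spec.provider of a GitRepository; prints nothing if the field is unset.
# $1 = namespace, $2 = GitRepository name (placeholders you supply).
get_git_repository_provider() {
  kubectl get gitrepository "$2" --namespace "$1" \
    --output jsonpath='{.spec.provider}'
}

# Turn the (possibly empty) value into a readable status line.
describe_provider() {
  if [ -n "$1" ]; then
    echo "provider is set to: $1"
  else
    echo "provider is not set"
  fi
}

# Example (hypothetical names):
# describe_provider "$(get_git_repository_provider flux-system app)"
```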
Hope this helps with troubleshooting. The Bicep fragment I use:
resource fluxConfigurationClusterApplications 'Microsoft.KubernetesConfiguration/fluxConfigurations@2024-11-01' = {
  name: 'app'
  scope: aks
  properties: {
    scope: 'cluster'
    namespace: SYSTEM_NAMESPACE
    sourceKind: 'GitRepository'
    gitRepository: {
      url: 'https://dev.azure.com/[org]/[project]/_git/[repo]'
      timeoutInSeconds: 600
      syncIntervalInSeconds: 300
      repositoryRef: {
        branch: fluxOptions.clusterBranch
      }
      provider: 'Azure'
    }
  }
}
@miqm The Bicep release with provider support is still in progress. I will update here when it's usable. Could you confirm whether you see any issues when you work directly with ARM using templates or the REST API?
Note that "az k8s-configuration" support/Terraform support is also WIP.
As far as I know (and I know Bicep very well), Bicep compiles to ARM JSON, and that JSON is what is used for the deployment, so there should be no reason why ARM would work and Bicep wouldn't, unless the ARM backend uses API endpoints other than the publicly available ones.
Looks like bicep build does proceed with the deployment even though the types are not released yet, with the warning below:
Warning BCP081: Resource type "Microsoft.KubernetesConfiguration/fluxConfigurations@2024-11-01" does not have types available. Bicep is unable to validate resource properties prior to deployment, but this will not block the resource from being deployed. [https://aka.ms/bicep/core-diagnostics#BCP081]
I tried a Bicep template with provider set in the westus2 region and did not see any issues with the provider being set. I will look into the correlation IDs and update if I find something.
@miqm I looked into the correlation IDs, and the only thing I can say for sure is that in the first request, the request the agent processed on the cluster did not have provider set to azure, while in the second request it did. The request body is not logged in the RP/dataplane layers, so it is hard to say which of the internal layers dropped the provider field in the first request.
Could you help me understand whether this was a one-off or whether you are noticing this behavior frequently?
Hi,
Since those requests I haven't made any deployments to the cluster where I have the Flux + ADO setup, so I can't say either way. Those requests were based on the same template, so it's very odd that it worked one time and not another. I'll try to trigger a re-run next week to see whether the provider field stays set or changes to null, and I'll let you know the result.