external-secrets icon indicating copy to clipboard operation
external-secrets copied to clipboard

chore: azure sdk update

Open hauswio opened this issue 7 months ago • 41 comments

Problem Statement

Azure sdk update and custom cloud support

Related Issue

Fixes #5151

Proposed Changes

  1. Feature Flag Configuration
  • Added UseAzureSDK *bool field to AzureKVProvider struct
  • Defaults to false (legacy SDK) for backward compatibility
  • Users can opt-in to new SDK by setting useAzureSDK: true
  1. Dual SDK Architecture
  • Legacy path: Uses existing go-autorest SDK (default)
  • New path: Uses azure-sdk-for-go with azcore, azidentity,
    and Key Vault specific packages
  • Both implementations coexist safely in the same codebase
  1. Complete Method Support Both SDK implementations support all Key Vault operations:
  • GetSecret - Retrieve individual secrets, keys, certificates
  • GetAllSecrets - List and retrieve multiple secrets with filtering
  • GetSecretMap - Retrieve secret collections as maps
  • PushSecret - Set/update secrets, keys, certificates with
    metadata
  1. Authentication Support All three authentication methods work with both SDKs:
  • Service Principal: Client ID/Secret and Client Certificate
    auth
  • Managed Identity: System and user-assigned identities
  • Workload Identity: Kubernetes service account token exchange
  1. Cloud Environment Support Both implementations support all Azure environments:
  • Public Cloud (default)
  • US Government Cloud
  • China Cloud
  • German Cloud

The new SDK supports custom endpoint via the AzureStackCloud environment

Migration Path for Users

Current Users (No Action Required)

  • Existing configurations continue working unchanged
  • Legacy SDK remains the default behavior

Opt-in to New SDK apiVersion: external-secrets.io/v1 kind: SecretStore metadata: name: azure-keyvault spec: provider: azurekv: vaultUrl: "https://my-vault.vault.azure.net/" useAzureSDK: true # Enable new SDK authType: ServicePrincipal # ... rest of configuration

Architecture:

  • Route all operations through feature flag check
  • Separate initialization functions for each SDK
  • Parallel method implementations (legacy vs new)
  • Unified error handling that normalizes responses
  • Used explicit AzureStackCloud environment type instead of checking custom config as precedence (better performance for common cases)
  • Legacy SDK explicitly rejects custom cloud configuration with helpful error messages

Checklist

  • [x] I have read the contribution guidelines
  • [x] All commits are signed with git commit --signoff
  • [x] My changes have reasonable test coverage
  • [x] All tests pass with make test
  • [x] I ensured my PR is ready for review with make reviewable

hauswio avatar Aug 19 '25 14:08 hauswio

FIXED: I had the python version of yq instead of the go version, thanks to debian.

Getting the following error when running the make commands:

jq: Unknown option -p=json

cursory look didn't make much sense to me, figured I'd comment in case yall have seen this before.

hauswio avatar Aug 19 '25 14:08 hauswio

/ok-to-test sha=6de7540d1a69f706ac0e2172dc1c2ba17bcd1914

moolen avatar Aug 19 '25 22:08 moolen

@moolen & @Skarlso

I believe I've implemented most of Skarlso's requested changes. I'm working going to work on the test updates soon. I see the sonarqube issues, but they seem to be flagging the same level of layering that exists in some of the legacy functions. I assume implementing a helper func to handle the secret existence checks on the setters would alleviate the issues?

I'll try to get the test updates to yall tomorrow.

hauswio avatar Aug 20 '25 20:08 hauswio

@Skarlso I know you wanted the sonarqube issues addressed, maybe you can help me understand them? Am I reading right that those functions have too many layers of loops/conditionals and its failing for cognitive load? The layers should match the old implementation, but looking at it I could do a helper func to move the secret checks out of each of those functions. Theoretically that could solve the issues.

hauswio avatar Aug 21 '25 17:08 hauswio

@hauswio I'm fine if you leave most it but please refactor getAllSecretsWithNewSDK at the very least. :)

For example checkTag and checkName could be small functions that return a boolean. And if false, continue.

Or the entire internal for loop can be a function on its own. Other than that, I ignored the rest of the errors.

Skarlso avatar Aug 22 '25 06:08 Skarlso

@moolen & @Skarlso

Just checking in if the recent changes meet your requests.

hauswio avatar Aug 26 '25 13:08 hauswio

Can you please run make check-diff and then commit the changes it leaves? :)

Skarlso avatar Aug 27 '25 04:08 Skarlso

Hi! While working on #5113, I realized it would be beneficial to have a single interface to manage authorizations with Azure.

Same auth package could be reused by SecretStore provider, ACR generator and any other stuff related to Azure auth. We only need to not construct client, but return azcore.TokenCredential instead.

Here is the code

bonddim avatar Aug 27 '25 14:08 bonddim

Hi! While working on #5113, I realized it would be beneficial to have a single interface to manage authorizations with Azure.

Same auth package could be reused by SecretStore provider, ACR generator and any other stuff related to Azure auth. We only need to not construct client, but return azcore.TokenCredential instead.

Here is the code

@bonddim I'll look into it, but do I understand you want to change the SecretProvider object to a single token auth type instead of the separate client types?

hauswio avatar Aug 28 '25 12:08 hauswio

Hi! While working on #5113, I realized it would be beneficial to have a single interface to manage authorizations with Azure. Same auth package could be reused by SecretStore provider, ACR generator and any other stuff related to Azure auth. We only need to not construct client, but return azcore.TokenCredential instead. Here is the code

@bonddim I'll look into it, but do I understand you want to change the SecretProvider object to a single token auth type instead of the separate client types?

Nope. All existing auth methods are supported

I mean, that we may create client for required resource when needed.

Upd: I didn't run e2e tests yet. And tried to make Azure auth options identical (as possible) without breaking change.

bonddim avatar Aug 28 '25 14:08 bonddim

@hauswio Hello 👋!

Is this ready for a review now?

Skarlso avatar Sep 04 '25 04:09 Skarlso

@Skarlso

Sorry for the lack of activity, I've been deploying these changes built locally to a few of our environments and handling other work. I believe this is ready to review for the associated issue. I defer to you on @bonddim request. From digging into it, it would probably double the size of this PR and require handling the feature flag across directories.

Thank you for your patience.

hauswio avatar Sep 05 '25 10:09 hauswio

Oh yeah, definitely don't do that. @bonddim your suggestion should go into a separate PR. This is only dealing with SDK update.

Skarlso avatar Sep 05 '25 10:09 Skarlso

/ok-to-test sha=1763c2389028e403fd323c9f047be9448ffac2a1

Skarlso avatar Sep 05 '25 12:09 Skarlso

2025-09-05T12:34:34.5249551Z     Did not match. Expected: &Secret***ObjectMeta:***      0 0001-01-01 00:00:00 +0000 UTC <nil> <nil> map[] map[] [] [] []***,Data:map[string][]byte***azkv-key-new-sdk: [123 34 107 105 100 34 58 34 104 116 116 112 115 58 47 47 101 115 111 45 101 50 101 45 107 101 121 118 97 117 108 116 46 118 97 117 108 116 46 97 122 117 114 101 46 110 101 116 47 107 101 121 115 47 101 50 101 45 116 101 115 116 115 45 101 115 111 45 97 122 117 114 101 45 107 101 121 116 121 112 101 45 113 110 114 100 55 45 107 101 121 116 101 115 116 47 102 102 98 53 99 102 97 101 56 56 52 100 52 53 55 49 56 54 101 52 52 99 97 50 57 51 97 50 50 49 102 49 34 44 34 107 116 121 34 58 34 82 83 65 34 44 34 107 101 121 95 111 112 115 34 58 91 34 101 110 99 114 121 112 116 34 44 34 100 101 99 114 121 112 116 34 44 34 115 105 103 110 34 44 34 118 101 114 105 102 121 34 44 34 119 114 97 112 75 101 121 34 44 34 117 110 119 114 97 112 75 101 121 34 93 44 34 110 34 58 34 111 105 68 66 87 55 85 114 114 101 56 106 95 45 110 66 48 114 50 67 97 122 65 79 107 90 74 120 106 115 99 86 112 113 72 48 79 104 112 65 111 105 75 56 118 109 68 107 99 82 50 57 56 83 78 57 97 66 83 117 73 50 122 52 76 67 83 72 75 68 78 90 57 118 77 107 72 55 108 75 116 78 109 49 117 111 49 117 86 50 80 48 82 122 79 98 75 103 70 48 105 80 90 50 116 48 100 101 73 113 45 85 76 107 52 120 84 85 82 73 109 76 71 55 78 78 75 81 49 117 97 75 54 97 50 51 54 116 86 79 78 79 76 120 118 114 105 115 56 57 90 88 95 90 120 86 109 116 101 110 77 110 67 85 111 87 45 69 98 82 114 105 72 104 113 66 51 88 72 80 104 117 56 72 74 73 66 90 57 118 50 120 56 56 78 89 89 84 45 115 115 78 121 114 79 112 104 88 73 72 82 54 74 71 112 99 121 48 121 110 106 118 100 115 74 88 112 105 72 68 84 100 48 70 110 57 67 50 103 88 117 101 84 72 70 90 72 109 54 71 57 112 97 102 109 75 84 101 76 113 73 52 99 45 79 77 98 99 85 54 117 54 45 54 72 101 70 103 89 109 104 102 87 103 54 54 81 70 84 87 74 120 107 108 55 118 69 107 87 117 106 85 99 105 119 85 111 82 89 103 71 89 87 49 100 95 85 106 85 52 73 110 109 70 117 69 116 66 71 65 77 109 84 86 45 87 98 116 84 99 98 81 34 44 34 101 34 58 34 65 81 65 66 34 125],***,Type:Opaque,StringData:map[string]string***,Immutable:nil,***, Got: &Secret***ObjectMeta:***target-secret  e2e-tests-eso-azure-keytype-qnrd7  cc225f35-600c-4fd3-8106-3b2c977feb3e 4511 0 2025-09-05 12:32:34 +0000 UTC <nil> <nil> map[] map[] [***external-secrets.io/v1 ExternalSecret e2e-es 6459b3a5-994c-41c1-869c-46073fa608f6 0xc000ea967e 0xc000ea967f***] [] [***externalsecrets.external-secrets.io/e2e-es Update v1 2025-09-05 12:32:34 +0000 UTC FieldsV1 ***"f:data":***".":***,"f:azkv-key-new-sdk":***,"f:metadata":***"f:annotations":***".":***,"f:reconcile.external-secrets.io/data-hash":***,"f:labels":***".":***,"f:reconcile.external-secrets.io/created-by":***,"f:reconcile.external-secrets.io/managed":***,"f:ownerReferences":***".":***,"k:***\"uid\":\"6459b3a5-994c-41c1-869c-46073fa608f6\"***":***,"f:type":*** ***]***,Data:map[string][]byte***azkv-key-new-sdk: [123 34 101 34 58 34 65 81 65 66 34 44 34 107 101 121 95 111 112 115 34 58 91 34 101 110 99 114 121 112 116 34 44 34 100 101 99 114 121 112 116 34 44 34 115 105 103 110 34 44 34 118 101 114 105 102 121 34 44 34 119 114 97 112 75 101 121 34 44 34 117 110 119 114 97 112 75 101 121 34 93 44 34 107 105 100 34 58 34 104 116 116 112 115 58 47 47 101 115 111 45 101 50 101 45 107 101 121 118 97 117 108 116 46 118 97 117 108 116 46 97 122 117 114 101 46 110 101 116 47 107 101 121 115 47 101 50 101 45 116 101 115 116 115 45 101 115 111 45 97 122 117 114 101 45 107 101 121 116 121 112 101 45 113 110 114 100 55 45 107 101 121 116 101 115 116 47 102 102 98 53 99 102 97 101 56 56 52 100 52 53 55 49 56 54 101 52 52 99 97 50 57 51 97 50 50 49 102 49 34 44 34 107 116 121 34 58 34 82 83 65 34 44 34 110 34 58 34 111 105 68 66 87 55 85 114 114 101 56 106 95 45 110 66 48 114 50 67 97 122 65 79 107 90 74 120 106 115 99 86 112 113 72 48 79 104 112 65 111 105 75 56 118 109 68 107 99 82 50 57 56 83 78 57 97 66 83 117 73 50 122 52 76 67 83 72 75 68 78 90 57 118 77 107 72 55 108 75 116 78 109 49 117 111 49 117 86 50 80 48 82 122 79 98 75 103 70 48 105 80 90 50 116 48 100 101 73 113 45 85 76 107 52 120 84 85 82 73 109 76 71 55 78 78 75 81 49 117 97 75 54 97 50 51 54 116 86 79 78 79 76 120 118 114 105 115 56 57 90 88 95 90 120 86 109 116 101 110 77 110 67 85 111 87 45 69 98 82 114 105 72 104 113 66 51 88 72 80 104 117 56 72 74 73 66 90 57 118 50 120 56 56 78 89 89 84 45 115 115 78 121 114 79 112 104 88 73 72 82 54 74 71 112 99 121 48 121 110 106 118 100 115 74 88 112 105 72 68 84 100 48 70 110 57 67 50 103 88 117 101 84 72 70 90 72 109 54 71 57 112 97 102 109 75 84 101 76 113 73 52 99 45 79 77 98 99 85 54 117 54 45 54 72 101 70 103 89 109 104 102 87 103 54 54 81 70 84 87 74 120 107 108 55 118 69 107 87 117 106 85 99 105 119 85 111 82 89 103 71 89 87 49 100 95 85 106 85 52 73 110 109 70 117 69 116 66 71 65 77 109 84 86 45 87 98 116 84 99 98 81 34 125],***,Type:Opaque,StringData:map[string]string***,Immutable:nil,***

Frigging Assert unhelpful. :D It's failing the Azure e2e test though so sadly, this is something related.

Skarlso avatar Sep 05 '25 12:09 Skarlso

Managed to reverse it:

		- `{"kid":"https://eso-e2e-keyvault.vault.azure.net/keys/e2e-tests-eso-azure-keytype-qnrd7-keytest/ffb5cfae884d457186e44ca293a221f1","kty":"RSA","key_ops":["encrypt","decrypt","sign","verify","wrapKey","unwrapKey"],"n":"oiDBW7Urre8j_-nB0r2CazAOkZJxjscVpqH0OhpAoiK8vmDkcR298SN9aBSuI2z4LCSHKDNZ9vMkH7lKtNm1uo1uV2P0RzObKgF0iPZ2t0deIq-ULk4xTURImLG7NNKQ1uaK6a236tVONOLxvris89ZX_ZxVmtenMnCUoW-EbRriHhqB3XHPhu8HJIBZ9v2x88NYYT-ssNyrOphXIHR6JGpcy0ynjvdsJXpiHDTd0Fn9C2gXueTHFZHm6G9pafmKTeLqI4c-OMbcU6u6-6HeFgYmhfWg66QFTWJxkl7vEkWujUciwUoRYgGYW1d_UjU4InmFuEtBGAMmTV-WbtTcbQ","e":"AQAB"}`
		- `{"e":"AQAB","key_ops":["encrypt","decrypt","sign","verify","wrapKey","unwrapKey"],"kid":"https://eso-e2e-keyvault.vault.azure.net/keys/e2e-tests-eso-azure-keytype-qnrd7-keytest/ffb5cfae884d457186e44ca293a221f1","kty":"RSA","n":"oiDBW7Urre8j_-nB0r2CazAOkZJxjscVpqH0OhpAoiK8vmDkcR298SN9aBSuI2z4LCSHKDNZ9vMkH7lKtNm1uo1uV2P0RzObKgF0iPZ2t0deIq-ULk4xTURImLG7NNKQ1uaK6a236tVONOLxvris89ZX_ZxVmtenMnCUoW-EbRriHhqB3XHPhu8HJIBZ9v2x88NYYT-ssNyrOphXIHR6JGpcy0ynjvdsJXpiHDTd0Fn9C2gXueTHFZHm6G9pafmKTeLqI4c-OMbcU6u6-6HeFgYmhfWg66QFTWJxkl7vEkWujUciwUoRYgGYW1d_UjU4InmFuEtBGAMmTV-WbtTcbQ"}`

Looks like something is missing with the get...

Skarlso avatar Sep 05 '25 12:09 Skarlso

The first one is expected... :D

Skarlso avatar Sep 05 '25 13:09 Skarlso

Oh! It looks like it's just not sorted. It's not like values are missing in there.

Skarlso avatar Sep 05 '25 13:09 Skarlso

@Skarlso forgive my ignorance. The locally ran test suite passed and its functioning in my AKS cluster now. I don't fully understand the output above. What's going wrong in the e2e?

hauswio avatar Sep 05 '25 13:09 hauswio

Not entirely sure. Something must have been sorting the output before the update. 🤔

Skarlso avatar Sep 05 '25 13:09 Skarlso

@moolen I keep looking through the code, but can't find anything really. Do you think the test was just plain flaky all this time? I doubt that. We should have seen it at least once...

Skarlso avatar Sep 05 '25 13:09 Skarlso

Ah, sorry. I think I missed the whole output there.

The metadata is mismatching. The got secret contains a lot more data. Which is weird. 🤔

Skarlso avatar Sep 05 '25 13:09 Skarlso

When I added test cases did it mess up something? I followed the same factory pattern but maybe missed a detail?

hauswio avatar Sep 05 '25 13:09 hauswio

Oh sorry. This is a new test. I will need to go through that then. It looks like it's failing.

Skarlso avatar Sep 05 '25 13:09 Skarlso

2025-09-05T12:34:34.3926792Z [38;5;9m• [FAILED] [122.132 seconds][0m
2025-09-05T12:34:34.3927885Z [azure] [38;5;204m[azure, keyvault, key][0m
2025-09-05T12:34:34.3929186Z [38;5;243m/usr/src/app/e2e/suites/provider/cases/azure/azure_key.go:30[0m
2025-09-05T12:34:34.3930738Z   [38;5;9m[1m[It] should sync keyvault objects with type=key using new SDK[0m
2025-09-05T12:34:34.3932804Z   [38;5;243m/usr/src/app/e2e/suites/provider/cases/azure/azure_key.go:69[0m

Skarlso avatar Sep 05 '25 13:09 Skarlso

basically the expected secret is empty safe for the Data section I believe.

Skarlso avatar Sep 05 '25 13:09 Skarlso

I was pretty sure I used the same env values as the existing tests for the remote ref, etc. Is it an issue with the actual testing secret or how I wrote the test?

hauswio avatar Sep 05 '25 13:09 hauswio

I rechecked my test code for creating the secretstore and doing the key test. The only difference between existing and new tests is setting the useAzureSDK bool on the secret store. Still authing to yalls keyvault with all the same values and pulling the same key ref. I don't know why the test is an issue now vs prior.

hauswio avatar Sep 05 '25 14:09 hauswio