sops icon indicating copy to clipboard operation
sops copied to clipboard

ocikms: Oracle Cloud KMS support

Open deblasis opened this issue 2 years ago • 2 comments

Hi there, I found myself needing this so I thought it was a good opportunity to contribute a little.

Happy to make the necessary changes, I tried to comply with style as much as possible.

Fixes #981

cc: @colemickens, @davidpinhas, @b-dean

Basic Demo

https://github.com/mozilla/sops/assets/29378614/f68a7a34-95a0-4a1c-846c-c61f1a091d5a

deblasis avatar Jun 23 '23 22:06 deblasis

Is there anything else blocking this PR from being merged main for the 3.10.0 milestone release? is there an ETA on 3.10.0?

jpark0910 avatar Mar 04 '25 16:03 jpark0910

I personally won't have time to review this anytime soon (see https://github.com/getsops/sops/pull/1097#issuecomment-2607904325). I'm not sure if anyone else from the maintainers has time either, but it might well be that this won't make it into 3.10 and will get moved to the 3.11 milestone (once that exists).

felixfontein avatar Mar 05 '25 05:03 felixfontein

Any update?

masonhuemmer avatar Sep 16 '25 17:09 masonhuemmer

so it does work now, but it's painfully slow.

it took 361s to encrypt or decrypt {"foo": "bar"}. when I did the same thing with oci kms crypto encrypt cli, it was under a second.

b-dean avatar Oct 02 '25 21:10 b-dean

so it does work now, but it's painfully slow.

it took 361s to encrypt or decrypt {"foo": "bar"}. when I did the same thing with oci kms crypto encrypt cli, it was under a second.

Hi @b-dean

My fault.

I believe that OCI KMS tried Instance Principal auth even when you had valid credentials in env vars, causing the timeout hell you experienced.

Now it checks if env vars/config work first, skipping Instance Principal entirely. Same pattern Azure and GCP use.

Your operations should go from 361s to microseconds. 🤞

Since I was there, I stole other patterns from the other providers and improved testability.

deblasis avatar Oct 03 '25 06:10 deblasis

@deblasis, your latest version works much better.

I was thinking about the change you made in 4e025d8, and the composing provider is supposed to take a slice of things and then only call the method (like p.TenancyOCID()) if none of the earlier providers returned one. It'd be nice if the instance principal one was not conditional, and the early exist wasn't there either.

So .. I made this in github.com/ontariosystems/[email protected]

p := ocep.LazyConfigProvider(auth.InstancePrincipalConfigurationProvider)

so it won't call the function to create the thing unless no other provider returns a valid TenancyOCID() (or whatever method is being called).

I also added a better composing provider that actually loops through everything for AuthType(), instead of just the first thing.

all together you could have:

// newIPProvider is a variable to allow tests to stub the Instance Principal provider factory
var newIPProvider = auth.InstancePrincipalConfigurationProvider

func configurationProvider() (common.ConfigurationProvider, error) {
	var providers []common.ConfigurationProvider

	// 1) Prefer the CLI-compatible envs used widely in CI/containers (envs only; no implicit fallbacks)
	providers = append(providers, ocep.OciCliEnvironmentConfigurationProvider())

	// 2) Native SDK envs (OCI_tenancy_ocid, OCI_user_ocid, OCI_fingerprint, OCI_private_key_path, OCI_region)
	providers = append(providers, common.ConfigurationProviderEnvironmentVariables("OCI", ""))

	// 3) File-based fallbacks
	if cfg := os.Getenv(OCICLIConfigFile); cfg != "" {
		if prof := os.Getenv(OCICLIProfile); prof != "" {
			if p, err := common.ConfigurationProviderFromFileWithProfile(cfg, prof, ""); err == nil {
				providers = append(providers, p)
			}
		} else {
			if p, err := common.ConfigurationProviderFromFile(cfg, ""); err == nil {
				providers = append(providers, p)
			}
		}
	}

	// 4) Instance principals for compute instances
	providers = append(providers, ocep.LazyConfigProvider(newIPProvider))	

	// 5) Always keep a last-resort default config provider at the end
	providers = append(providers, common.DefaultConfigProvider())

	return ocep.ComposingConfigProvider(providers), nil

I do wonder if the last to should be switched. the instance principal one only works on instances, where as the default works so long as you have things correct in ~/.oci/config or whatever other default locations.

So that'd be:

  1. OCI_CLI env
  2. OCI env
  3. config files from OCI_CLI_CONFIG_FILE and OCI_CLI_PROFILE
  4. default config from sdk
  5. instance profile config

thoughts?

b-dean avatar Oct 03 '25 21:10 b-dean

Hey @b-dean! Thanks for the detailed suggestions and for adding those improvements to github.com/ontariosystems/oci-cli-env-provider v0.2.0

About this, my personal preference would be not to have the dependency at all to be honest. I added it initially but I am not a fan of it. Nothing against your library (it's solid!). Just a matter of codebase hygiene if that makes sense.

I'd like a pointer from the maintainers on this matter as they might find that a totally self contained implementation is preferable over abstractions coming from external packages whenever possible. While in doubt currently we have a hybrid approach where for now we just use ocep.OciCliEnvironmentConfigurationProvider.

https://github.com/getsops/sops/pull/1226/files#diff-a085f93310390109561334c42f9f26e612c342ded5b8318a80c71c8357f34728R28

Let's see what they say. I'd be happy to:

  1. re-implement that here and remove the dependency entirely or
  2. to offload the configuration logic completely to the external package or
  3. leave as is

My preference is on 1.

That being said, I went ahead and implemented the lazy evaluation pattern locally in a very similar fashion to your approach. I realized that the early exit was already handled by common.ComposingConfigurationProvider internally.

I did take your advice on the provider ordering as well. Runtime magic now is last resort, allowing for config overrides.

deblasis avatar Oct 04 '25 02:10 deblasis

@deblasis , that makes sense. It'd be nice if Oracle would've just put some of this in their sdk to begin with. I'm fine if you want to copy parts of the code I wrote directly in here, if the sops maintainers would rather not add more libraries.

I mainly published that because I wanted to use the OCI_CLI_* env and not have to make ~/.oci/config everywhere I'd like to use sops.

b-dean avatar Oct 04 '25 04:10 b-dean

@deblasis , that makes sense. It'd be nice if Oracle would've just put some of this in their sdk to begin with.

I mainly published that because I wanted to use the OCI_CLI_* env and not have to make ~/.oci/config everywhere I'd like to use sops.

I think the best thing you can do is to file a PR upstream to add native support for OCI_CLI_* env straight into the SDK! :)

deblasis avatar Oct 05 '25 01:10 deblasis

Test failed due to a race condition originating from the SDK code itself.

I filed a PR over there with repro steps and a fix

https://github.com/oracle/oci-go-sdk/pull/603

deblasis avatar Oct 07 '25 23:10 deblasis