orca icon indicating copy to clipboard operation
orca copied to clipboard

fix(webhook): Safer defaults and more config for webhook URLs

Open jcavanagh opened this issue 2 years ago • 12 comments

fix(webhook): Safer defaults and more config for webhook URLs

Exclude by default: single-word hostnames (https://orca, https://spin-orca), the .spinnaker domain (a common k8s deployment namespace), common internal-name suffixes (.local, .internal, .localdomain), and all verbatim IP addresses.

Add new configuration to specify a list of exclusion patterns. This greatly simplifies configuration, as it is not easy to do complex filtering in a single allow expression.

Add new configuration to dynamically exclude domains based on the values of specified environment variables. For example, this can always exclude the k8s namespace Spinnaker is currently running in, long as there is some variable set that specifies what that is. POD_NAMESPACE is commonly set by providers, and is included by default along with ISTIO_META_MESH_ID, as names in that domain are also resolvable.

Also allows localhost in all cases if the rejectLocalhost flag is false, disregarding the name filter. This avoids the need to change the name filter to include all forms of local names while developing.

jcavanagh avatar Nov 11 '22 18:11 jcavanagh

Note assuming merge... we'll want to call out this change in release notes, as it'd cause any hook to an internal call in another namespace to likely break. Can ALSO impact hooks to other private CIDRs like private VPCs on a TGW/peer... so LIKELY a breaking change for most.

jasonmcintosh avatar Nov 11 '22 18:11 jasonmcintosh

as it'd cause any hook to an internal call in another namespace to likely break

Other k8s namespaces should work just fine, so long as they are not named spinnaker, local or internal.

Can ALSO impact hooks to other private CIDRs like private VPCs on a TGW/peer... so LIKELY a breaking change for most.

Unfortunately, specifying private IPs verbatim in the webhook bypasses domain filtering, and could allow resolution of internal Spinnaker services. Alternatively, we could just disallow any literal IP address in the webhook configuration.

jcavanagh avatar Nov 11 '22 19:11 jcavanagh

@dbyron-sf or @kskewes-sf FYI on this one...

jasonmcintosh avatar Nov 11 '22 20:11 jasonmcintosh

@jasonmcintosh I've made the following changes to the implementation:

  • Verbatim IP addresses are now rejected by default
  • IP filtering is again considered against the resolved address
  • No IP ranges are excluded by default

Trying to filter internal requests by IP is an unwinnable game of whack-a-mole. I'll give some thought to how we might recharacterize user-initiated requests in general, so that they can be rejected by internal APIs without any special configuration.

jcavanagh avatar Nov 11 '22 22:11 jcavanagh

In general I support this. I'm a little hesitant because I get the feeling it changes existing behavior. Could you @jcavanagh perhaps write the release note for it so we can see how it affects users? Are we releasing versions of spinnaker fast enough to do this in two stages -- one adds the knobs, but doesn't change behavior, and the second one changes the defaults?

dbyron-sf avatar Nov 11 '22 22:11 dbyron-sf

The description implies this, but I want to go ahead and write it out -- the idea here is that someone making calls into spinnaker itself in a webhook stage is probably doing something unsavory, and so let's forbid it by default, right?

dbyron-sf avatar Nov 11 '22 22:11 dbyron-sf

The description implies this, but I want to go ahead and write it out -- the idea here is that someone making calls into spinnaker itself in a webhook stage is probably doing something unsavory, and so let's forbid it by default, right?

Yes, that's correct. This is not a complete fix, but does enhance the ability of operators to mitigate the most probable vector (Webhook stages) specifically. The main issues here are toning down the flexibility of k8s DNS, and disallowing exhaustive attacks within a generally-constrained IP space.

perhaps write the release note for it so we can see how it affects users? Are we releasing versions of spinnaker fast enough to do this in two stages -- one adds the knobs, but doesn't change behavior, and the second one changes the defaults?

I'm not opposed to that - we could move those env-var values to the release notes as recommendations instead.

Disallowing any IP address by default could certainly be a breaking change, but I don't think there's any real justification for such a configuration. If an operator wishes, the option can be disabled.

jcavanagh avatar Nov 12 '22 00:11 jcavanagh

ORIGINALLY the plan was "ok let's break this and make it a DENY ALL by default, then REQUIRE end users to whitelist any supported webhook targets". AKA you'd want to manually add either a list or a regex. BUT make it secure default vs. not.

The current implemetnation allows calls to even internal API's which it really really shouldn't as this lets you bypass some restrictions. Link-local blocks help a TON (aka blocking ec2 metadata endpoints), but... ideally should block cluster addresses as well for these, and only allow cluster-external calls.

jasonmcintosh avatar Nov 14 '22 17:11 jasonmcintosh

ORIGINALLY the plan was "ok let's break this and make it a DENY ALL by default, then REQUIRE end users to whitelist any supported webhook targets". AKA you'd want to manually add either a list or a regex. BUT make it secure default vs. not.

When is "originally"? This change was never a deny-all implementation, nor is that something that I think operators would want. I might be missing some context here.

The current implemetnation allows calls to even internal API's which it really really shouldn't as this lets you bypass some restrictions. Link-local blocks help a TON (aka blocking ec2 metadata endpoints), but... ideally should block cluster addresses as well for these, and only allow cluster-external calls.

I think you may have missed the most recent changes - see my other comment here.

jcavanagh avatar Nov 14 '22 18:11 jcavanagh

https://github.com/spinnaker/orca/pull/4214 was a PR to discuss changing the IP/DNS restrictions. However, the "Break it and require overrides to be set" was backed out to prevent a breaking change. Via commit: https://github.com/spinnaker/orca/pull/4214/commits/4e25319b13570c7887392c31bcd09e3d9ef8846b#diff-6d7eea546c2be1a74db8daa4f32860c679584686c8cc7cd1c3f6fe6a199f8918R40

jasonmcintosh avatar Nov 15 '22 22:11 jasonmcintosh

@mergifyio update

dbyron-sf avatar Dec 13 '22 22:12 dbyron-sf

update

✅ Branch has been successfully updated

mergify[bot] avatar Dec 13 '22 22:12 mergify[bot]

@mergifyio update

dbyron-sf avatar Sep 27 '24 20:09 dbyron-sf

update

☑️ Nothing to do

  • [ ] #commits-behind > 0 [📌 update requirement]
  • [X] -closed [📌 update requirement]
  • [X] -conflict [📌 update requirement]
  • [X] queue-position = -1 [📌 update requirement]

mergify[bot] avatar Sep 27 '24 20:09 mergify[bot]