orca
orca copied to clipboard
fix(webhook): Safer defaults and more config for webhook URLs
fix(webhook): Safer defaults and more config for webhook URLs
Exclude by default: single-word hostnames (https://orca
, https://spin-orca
), the .spinnaker
domain (a common k8s deployment namespace), common internal-name suffixes (.local
, .internal
, .localdomain
), and all verbatim IP addresses.
Add new configuration to specify a list of exclusion patterns. This greatly simplifies configuration, as it is not easy to do complex filtering in a single allow expression.
Add new configuration to dynamically exclude domains based on the values of specified environment variables. For example, this can always exclude the k8s namespace Spinnaker is currently running in, long as there is some variable set that specifies what that is. POD_NAMESPACE
is commonly set by providers, and is included by default along with ISTIO_META_MESH_ID
, as names in that domain are also resolvable.
Also allows localhost
in all cases if the rejectLocalhost
flag is false
, disregarding the name filter. This avoids the need to change the name filter to include all forms of local names while developing.
Note assuming merge... we'll want to call out this change in release notes, as it'd cause any hook to an internal call in another namespace to likely break. Can ALSO impact hooks to other private CIDRs like private VPCs on a TGW/peer... so LIKELY a breaking change for most.
as it'd cause any hook to an internal call in another namespace to likely break
Other k8s namespaces should work just fine, so long as they are not named spinnaker
, local
or internal
.
Can ALSO impact hooks to other private CIDRs like private VPCs on a TGW/peer... so LIKELY a breaking change for most.
Unfortunately, specifying private IPs verbatim in the webhook bypasses domain filtering, and could allow resolution of internal Spinnaker services. Alternatively, we could just disallow any literal IP address in the webhook configuration.
@dbyron-sf or @kskewes-sf FYI on this one...
@jasonmcintosh I've made the following changes to the implementation:
- Verbatim IP addresses are now rejected by default
- IP filtering is again considered against the resolved address
- No IP ranges are excluded by default
Trying to filter internal requests by IP is an unwinnable game of whack-a-mole. I'll give some thought to how we might recharacterize user-initiated requests in general, so that they can be rejected by internal APIs without any special configuration.
In general I support this. I'm a little hesitant because I get the feeling it changes existing behavior. Could you @jcavanagh perhaps write the release note for it so we can see how it affects users? Are we releasing versions of spinnaker fast enough to do this in two stages -- one adds the knobs, but doesn't change behavior, and the second one changes the defaults?
The description implies this, but I want to go ahead and write it out -- the idea here is that someone making calls into spinnaker itself in a webhook stage is probably doing something unsavory, and so let's forbid it by default, right?
The description implies this, but I want to go ahead and write it out -- the idea here is that someone making calls into spinnaker itself in a webhook stage is probably doing something unsavory, and so let's forbid it by default, right?
Yes, that's correct. This is not a complete fix, but does enhance the ability of operators to mitigate the most probable vector (Webhook stages) specifically. The main issues here are toning down the flexibility of k8s DNS, and disallowing exhaustive attacks within a generally-constrained IP space.
perhaps write the release note for it so we can see how it affects users? Are we releasing versions of spinnaker fast enough to do this in two stages -- one adds the knobs, but doesn't change behavior, and the second one changes the defaults?
I'm not opposed to that - we could move those env-var values to the release notes as recommendations instead.
Disallowing any IP address by default could certainly be a breaking change, but I don't think there's any real justification for such a configuration. If an operator wishes, the option can be disabled.
ORIGINALLY the plan was "ok let's break this and make it a DENY ALL by default, then REQUIRE end users to whitelist any supported webhook targets". AKA you'd want to manually add either a list or a regex. BUT make it secure default vs. not.
The current implemetnation allows calls to even internal API's which it really really shouldn't as this lets you bypass some restrictions. Link-local blocks help a TON (aka blocking ec2 metadata endpoints), but... ideally should block cluster addresses as well for these, and only allow cluster-external calls.
ORIGINALLY the plan was "ok let's break this and make it a DENY ALL by default, then REQUIRE end users to whitelist any supported webhook targets". AKA you'd want to manually add either a list or a regex. BUT make it secure default vs. not.
When is "originally"? This change was never a deny-all implementation, nor is that something that I think operators would want. I might be missing some context here.
The current implemetnation allows calls to even internal API's which it really really shouldn't as this lets you bypass some restrictions. Link-local blocks help a TON (aka blocking ec2 metadata endpoints), but... ideally should block cluster addresses as well for these, and only allow cluster-external calls.
I think you may have missed the most recent changes - see my other comment here.
https://github.com/spinnaker/orca/pull/4214 was a PR to discuss changing the IP/DNS restrictions. However, the "Break it and require overrides to be set" was backed out to prevent a breaking change. Via commit: https://github.com/spinnaker/orca/pull/4214/commits/4e25319b13570c7887392c31bcd09e3d9ef8846b#diff-6d7eea546c2be1a74db8daa4f32860c679584686c8cc7cd1c3f6fe6a199f8918R40
@mergifyio update
update
✅ Branch has been successfully updated
@mergifyio update
update
☑️ Nothing to do
- [ ]
#commits-behind > 0
[📌 update requirement] - [X]
-closed
[📌 update requirement] - [X]
-conflict
[📌 update requirement] - [X]
queue-position = -1
[📌 update requirement]