Integrate with OpenPolicyAgent GateKeeper for unified policy definitions

Open trstringer opened this issue 2 years ago • 14 comments

Please describe the Improvement and/or Feature Request

OPA GateKeeper is a Kubernetes implementation for policy definition and enforcement. By adding GK integration with OSM we can unblock users that are looking to have centralized policy definitions with OPA.

Scope (please mark with X where applicable)

  • New Functionality [X]
  • SMI Traffic Access Policy [X]
  • SMI Traffic Specs Policy [X]
  • SMI Traffic Split Policy [X]

Possible use cases

GateKeeper users define their policies with OPA and their common tooling. Integrating with OSM would allow these users to also define service mesh policies the same way.

trstringer avatar Apr 07 '22 20:04 trstringer

OSM currently has alpha support for external authorization using OPA: https://release-v1-0.docs.openservicemesh.io/docs/guides/integrations/external_auth_opa/

Is the use case here different?

shashankram avatar Apr 07 '22 21:04 shashankram

Yes, the different use case is that GateKeeper provides a Kubernetes-native layer on top of OPA that gives users a more intuitive way of working with policies. So even though we support the OPA Envoy plugin, GateKeeper is a different approach to solving this problem that has a better user experience.

trstringer avatar Apr 07 '22 22:04 trstringer

Thanks for sharing. Do you think we should drop support for the existing OPA integration and use Gatekeeper as the interface to integrate OPA with OSM so users have a simple yet robust experience?

shashankram avatar Apr 07 '22 22:04 shashankram

I do think so. A unified approach would be a more direct and better experience for users that prefer OPA-based policy management.

trstringer avatar Apr 07 '22 22:04 trstringer

Here's some background information I've gathered after some research:

OPA Gatekeeper is a policy enforcement engine specifically for Kubernetes resources. It's deployed as a validating webhook and evaluates resources submitted to the API server for compliance against policy defined in the cluster (via CRDs). One of the most exciting features of OPA Gatekeeper is its ability to audit existing cluster state against defined policies, allowing cluster administrators to detect non-compliant resources and fix them before hard-failing. I think this has the potential to make the upgrade story between OSM versions very smooth, and it may make sense to maintain a separate repo of best-practice Gatekeeper policies that the community can use and contribute to as well.
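To make the admission-versus-audit distinction concrete, here is a minimal Python sketch. It is not Gatekeeper code (real Gatekeeper policies are written in Rego and registered through CRDs); the names `require_owner_label`, `admit`, and `audit` are hypothetical and only model the idea that the same policy can both reject new resources and report on ones that already exist.

```python
from typing import Any, Callable, Dict, List, Tuple

Resource = Dict[str, Any]
# A policy takes a resource and returns a list of violation messages.
Policy = Callable[[Resource], List[str]]


def require_owner_label(resource: Resource) -> List[str]:
    """Hypothetical policy: every resource must carry an 'owner' label."""
    labels = resource.get("metadata", {}).get("labels", {})
    return [] if "owner" in labels else ["missing required label 'owner'"]


def admit(resource: Resource, policies: List[Policy]) -> Tuple[bool, List[str]]:
    """Admission path: reject the request if any policy reports a violation."""
    violations = [msg for policy in policies for msg in policy(resource)]
    return (not violations, violations)


def audit(existing: List[Resource], policies: List[Policy]) -> Dict[str, List[str]]:
    """Audit path: report violations on existing resources without blocking."""
    report: Dict[str, List[str]] = {}
    for resource in existing:
        _, violations = admit(resource, policies)
        if violations:
            report[resource["metadata"]["name"]] = violations
    return report
```

The audit path reuses the admission logic but only collects results, which is what would let an administrator find and fix non-compliant resources before switching a policy to hard-fail.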

keithmattix avatar Apr 18 '22 20:04 keithmattix

I'm wondering if there's been some confusion on native OPA vs Gatekeeper. Have the customer requests been about having OSM configure Gatekeeper policies to prevent certain actions, or about allowing users to apply custom traffic management policies via OPA/Envoy integration?

It's almost certainly the latter, as us using a custom validating webhook vs Gatekeeper enforcement should be opaque to the user. If that's the case, let's clarify, rescope, and rename this issue.

This doesn't necessarily preclude Keith's point, but I'd question the value of providing Gatekeeper policies, vs incorporating these policies natively in our validating webhook.

steeling avatar Apr 19 '22 14:04 steeling

My understanding is that we would want to allow OSM users to specify policies through a common policy engine (OPA) that can be delivered through the Kubernetes-native approach with GateKeeper. Put another way, all policies (including OSM) are OPA policies, implemented through GK.

trstringer avatar Apr 19 '22 14:04 trstringer

Gatekeeper is geared towards a specific type of policy, which is admission control. This is separate from configuring traffic rules or integrating with Envoy to affect the data plane.

steeling avatar Apr 19 '22 15:04 steeling

More info here https://github.com/openservicemesh/osm/issues/1874

steeling avatar Apr 19 '22 15:04 steeling

@steeling's distinction is correct; Gatekeeper's value-add is strictly admission control for business policies custom to an organization. The difference between policies enforced with Gatekeeper vs our own validating webhook is that the former validates best practices and the latter validates correctness. For example, our validating webhook should absolutely allow users to create IngressBackends without mTLS. But for a highly regulated organization, that may be completely against their SecOps policy, so they want to block (potentially malicious) resources that do not comply with that policy.
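As a rough illustration of such an organization-specific rule, the Python sketch below flags IngressBackend resources whose backends do not enforce mTLS. It is only a model of the check; an actual Gatekeeper policy would be written in Rego. The function name is hypothetical, and the field names (`spec.backends[].port.protocol` and `spec.backends[].tls.skipClientCertValidation`) are my assumption about the IngressBackend schema and should be verified against the OSM API reference.

```python
from typing import Any, Dict, List


def ingress_backend_violations(resource: Dict[str, Any]) -> List[str]:
    """Return violations for an IngressBackend that does not enforce mTLS.

    Assumed schema: each entry in spec.backends has port.protocol and an
    optional tls.skipClientCertValidation flag.
    """
    if resource.get("kind") != "IngressBackend":
        return []  # The rule applies only to IngressBackend resources.

    violations: List[str] = []
    for backend in resource.get("spec", {}).get("backends", []):
        name = backend.get("name", "<unnamed>")
        protocol = backend.get("port", {}).get("protocol", "")
        skip = backend.get("tls", {}).get("skipClientCertValidation", False)
        if protocol != "https":
            violations.append(f"backend {name}: protocol {protocol!r} is not https")
        elif skip:
            violations.append(f"backend {name}: client certificate validation is skipped")
    return violations
```

A plain-HTTP IngressBackend is still a correct resource, so a correctness-focused webhook would accept it; a rule like this would reject it only in clusters whose operators opted in.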

I think both Envoy authz and Gatekeeper policies have value; it's just a matter of clarifying what users are asking for at the moment.

keithmattix avatar Apr 19 '22 15:04 keithmattix

IMO I see less value in Gatekeeper. Either a user can configure their own, or they want our input. If they want our input, we should use the default webhooks, since that will apply to all users, not just gatekeeper users.

If we have a set of recommended ones, maybe we add a flag, or leverage a configmap, etc, again to cover all users. Happy to discuss though! I do like the idea of furthering our integrations in the ecosystem, I just want to be careful of pigeon-holing ourselves.

steeling avatar Apr 19 '22 15:04 steeling

Do we have agreement on whether to work on this item for v1.3 or move it to vFuture for now?

allenlsy avatar Jul 11 '22 19:07 allenlsy

Added default label size/needed. Please consider re-labeling this issue appropriately.

github-actions[bot] avatar Jul 13 '22 00:07 github-actions[bot]

I've done a demo of this integration. In that scenario, an OPA endpoint outside the cluster provided an extra gate to ensure the traffic policy was in place: I could delete the OSM policy and OPA would still block the traffic. That's probably the main use case for this integration: a separate team can ensure the integrity of the OSM policies. If the OPA rego template and OSM policy can be streamlined I see value in that, but I would say this would mainly be beneficial in a zero trust environment for people needing this level of control. I know the benefit of GK is not having to develop the rego templates, so if it can streamline that part, that would be beneficial to a customer.

phillipgibson avatar Aug 02 '22 14:08 phillipgibson

This issue will be closed due to a long period of inactivity. If you would like this issue to remain open then please comment or update.

github-actions[bot] avatar Jan 23 '23 00:01 github-actions[bot]

Issue closed due to inactivity.

github-actions[bot] avatar Jan 31 '23 00:01 github-actions[bot]