gatekeeper-library icon indicating copy to clipboard operation
gatekeeper-library copied to clipboard

Allow reuse of rego snippets

Open grosser opened this issue 4 years ago • 8 comments

we have this and some other chunks copied in 5+ policies and idea how to clean this up/make reuse work (except by using a new layer of templating) some shared libraries or defining and then calling out to go libraries would be great

        # pods
        pod_template() = pod {
          input.review.object.kind == "Pod"
          pod := input.review.object
        }
        # statefulsets, deployment, daemonsets, jobs
        pod_template() = pod {
          input.review.object.spec.template.spec.containers[0]
          pod := input.review.object.spec.template
        }
        # cronjobs
        pod_template() = pod {
          input.review.object.spec.jobTemplate.spec.template.spec.containers[0]
          pod := input.review.object.spec.jobTemplate.spec.template
        }

grosser avatar Jun 23 '20 01:06 grosser

+1 to supporting better code reuse.

We have the beginnings of that in the libs field of the constraint template. That is intended so that code may be safely shared without worrying about naming collisions.

There is definitely a lot of tooling work to be done around automatically bundling common libraries into constraint templates as part of a build process.

maxsmythe avatar Jun 23 '20 07:06 maxsmythe

Only usage of libs I could find is here is https://github.com/open-policy-agent/gatekeeper/blob/1312aee9bc3e772003f5113ce4e454be7367e8d7/test/bats/tests/templates/k8scontainterlimits_template.yaml

  • Docs here on how libs work would be nice
  • I could not find anything in the rego docs on import, running opa test ignores import (does not fail / does not do anything).
  • Since I'm not keeping my rego code in yaml files, I'd need to make it also work when using opa test cli too.

... so ideally import would work locally by loading in a relative file and when I generate my constraints I either generate them into libs or dump helpers and code into the rego field.

grosser avatar Jun 26 '20 03:06 grosser

Found some example setup here https://github.com/plexsystems/konstraint/tree/main/examples

... so I think the idea is that:

  • during local operation all files get loaded -> all packages are defined
  • when generating the constraints the libs get passed in via an extra field so they are available too

grosser avatar Jun 27 '20 00:06 grosser

@grosser, yep that's the idea currently. Anything slated for potential re-use would go into libs and the rego would then import it.

But as @maxsmythe points of, still going back and forth on the best way to more systematically import what you need, rather than import the world every time.

jpreese avatar Jul 01 '20 21:07 jpreese

FYI I ended up making small libs and then generating the policy by scanning for which imports are needed

      libs = rego.scan(/^import data\.lib\.([^.\s]+)/).flatten(1).map do |lib|
        File.read("policies/lib/#{lib}.rego")
      end

grosser avatar Jul 02 '20 01:07 grosser

@grosser Konstraint does that as well. For every rego, it'll look at the import statements and add the imports to the lib section.

jpreese avatar Jul 02 '20 01:07 jpreese

I'd love to see this potentially be extended into either loading common libs from ConfigMaps as is kinda-sorta the case for vanilla OPA, or else see a CRD created for libs. If I define a lib in one ConstraintTemplate, can it be referenced by name from another ConstraintTemplate?

rbkaspr avatar Feb 18 '21 11:02 rbkaspr

I'd love to see this potentially be extended into either loading common libs from ConfigMaps as is kinda-sorta the case for vanilla OPA, or else see a CRD created for libs. If I define a lib in one ConstraintTemplate, can it be referenced by name from another ConstraintTemplate?

Not currently. Gatekeeper currently requires each template be self-contained. The benefit of that is that you avoid dependency conflicts.

Imagine one template relied on a function:

less_than(a, b)

And another template relied on a newer version of that function that allowed fuzzy matching:

less_than(a, b, delta)

The example is a bit contrived, but there are plenty of examples of dependency conflicts in the wild

libs is a way to make it easier to embed hard-copies of a dependency into a template so that template consumers don't need to worry about conflicts happening live on their cluster.

There is also a potential security issue where one template would be able to maliciously extend another template's library if they had access to a shared namespace.

maxsmythe avatar Feb 20 '21 06:02 maxsmythe

This issue/PR has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Jan 31 '23 23:01 stale[bot]

this is similar to #206

maxsmythe avatar Feb 01 '23 02:02 maxsmythe

It seems like this issue should be on the main Gatekeeper project, no?

Regardless, adding libs to existing templates using kustomize or scripting is one thing, but doesn't seem like close to an ideal solution as it involves grossly expanding the size of the objects with lots of duplicate code. It seems like some kind of new CRD Library is needed to house common code that can be imported into arbitrary templates. For example, a template could declare its Library dependencies using label selectors or by name, and the controller makes those files available.

As for conflicts, it seems like each template should have its own scope (otherwise shouldn't those conflicts already exist? Say two templates declaring a rule with the same name). So if a template tries to extend the library, it doesn't influence any other template

The main issue I see (knowing nothing about the internals, really) is integration with gator. Locally, if I declare a dependency, gator needs to know how to load that dependency without talking to a cluster. Solvable, I think, by having a way to configure it to "gather" the relevant libraries locally from the repo so you can simulate the import clientside

jcmcken avatar Mar 23 '23 15:03 jcmcken

It seems like this issue should be on the main Gatekeeper project, no?

I think this is about the ability to re-use Rego in the writing process, not necessarily avoiding duplicating compiled code.

doesn't seem like close to an ideal solution as it involves grossly expanding the size of the objects with lots of duplicate code.

There is definitely a tradeoff where we are gaining a lack of need to worry about dependency conflicts (and security/speed as I'll get into in a second) for space. So far I haven't heard any complaints about memory footprint due to copied Rego, so definitely curious if that's a problem. By far the largest consumer of RAM is usually the informer cache for synced resources (at least on large clusters).

WRT the speed gain: each constraint template is compiled into a separate Rego environment, so they are unable to share code. The security/speed benefits of this relative to compiling everything to the same Rego environment (which we used to do) are:

  • When adding a new template, the entire corpus of Rego no longer needs to be recompiled, which was expensive

  • There is no need to sandbox templates from each other by statically analyzing the Rego to prevent over-broad access, which makes it easier to ensure templates cannot affect each other in multi-tenant environments.

  • We are able to offload common operations, like evaluating match criteria, to pure golang, which tends to be more performant (there are maybe ways to do this while still keeping everything in the same Rego environment, but coupled with the benefits to compilation time, this was a win)

See this doc for more info about the performance improvements.

It seems like some kind of new CRD Library is needed to house common code that can be imported into arbitrary templates. For example, a template could declare its Library dependencies using label selectors or by name, and the controller makes those files available.

Aside from raising the need to resolve dependency conflicts (maybe solvable if you, say, require the users to reference their dependency via semver), there would also be a usability concern here. How would these common libraries be packaged and redistributed?

As for conflicts, it seems like each template should have its own scope (otherwise shouldn't those conflicts already exist? Say two templates declaring a rule with the same name). So if a template tries to extend the library, it doesn't influence any other template

Currently each template has its own scope for its local code, but that doesn't change the problem of dependency conflicts, since we are talking about shared libraries, which implies each template would be referencing the same rego code. This is the same sort of problem as static vs. dynamic linking of libraries.

The "maliciously extend" mentioned in an above comment was less about whether a template has its own scope and more about if they are also importing a common scope (e.g. data.libs.common). In the case where they are and data.libs.common is in any way writeable (say by adding two libraries that have the same package path), then there is opportunity for exploitation. Preventing exploitation may be possible, but avoiding shared code altogether seems a stronger guarantee.

The main issue I see (knowing nothing about the internals, really) is integration with gator. Locally, if I declare a dependency, gator needs to know how to load that dependency without talking to a cluster. Solvable, I think, by having a way to configure it to "gather" the relevant libraries locally from the repo so you can simulate the import clientside

I think, at that point, the libraries could live next to, and be loaded with, the constraint templates (using the same methods a user is leveraging to load the templates into gator).

Finally, there is the elephant-in-the-room that is the CEL KEP. When dealing with language-specific features, we should be careful to remain generic so that we don't accidentally bake something into the system that makes it hard to adapt to the changing landscape. The use of common libraries, if adopted, would be one of those things that should be generic.

There are more pedestrian concerns such as code complexity (e.g. dealing with the edge cases of a multi-resource controller), that are surmountable, but definitely not trivial.

In summary, I don't think it's impossible to have shared libraries, but I'd definitely want to see that there is a real need, that the benefits outweigh the costs and get a sense of how the CEL KEP reshapes the environment before treading too deep into a design.

maxsmythe avatar Mar 24 '23 01:03 maxsmythe

Just for more background, we use a COTS product called Styra as a replacement for Gatekeeper. It's the main feature missing in Gatekeeper that we use heavily in Styra (in our case, we have ~1000 lines of Rego code which includes all the library code plus the business logic, so you can see the benefit of abstracting rules into a library). Now granted, Styra doesn't store policy as CRDs in the cluster. So there's a slightly different architecture and different concerns. But that's where I'm coming from

jcmcken avatar Apr 04 '23 14:04 jcmcken

if duplicating them in CRDs is not getting too large you should be able to do that

grosser avatar Apr 04 '23 14:04 grosser

It's the main feature missing in Gatekeeper that we use heavily in Styra (in our case, we have ~1000 lines of Rego code which includes all the library code plus the business logic, so you can see the benefit of abstracting rules into a library).

I'm curious if all policies use all library code, or if selective importing would limit the duplication? How many of those 1000 lines are library code?

Also, in terms of RAM usage, I'm curious how many constraint templates you'd be looking to use?

Assuming an average of 100 characters/line, 1000 lines is around 98 kB (though unsure what the compiled size would be, worth looking into), which would be require ~100 templates for 10 MB of RAM usage for the string representation. Let's assume 50 MB of RAM to account for duplicate caching and size of compiled code (it could be much better or worse than that, I haven't benchmarked shared Rego as TBH the template library has only a few examples that are much smaller than this).

These are all obviously very rough rules-of-thumb, but I'm interested in figuring out when the management complexity vs. infrastructure spend cost/benefit analysis starts to tip for most people.

At a macro level, some amount of code duplication is implicit in most containerized/sandboxed applications. Docker containers share a kernel, but have differing binaries/libraries/runtimes, for example. IMO it's the same issue here: what are the tradeoffs between operational complexity, operational fragility, and infrastructure cost?

I'm also curious... it seems like you're looking to use multiple templates, which implies you see value in sharding your existing code across templates vs. creating one mega-template. I can make guesses, but don't what to assume: what benefits are you looking for in switching to constraints/templates?

maxsmythe avatar Apr 05 '23 02:04 maxsmythe

if duplicating them in CRDs is not getting too large you should be able to do that

That's what we'll do for now, perhaps using something like Helm templates to generate everything. We have a little bit of work to do to migrate everything though so I haven't been able to test it yet to see if we hit any limitations

Also, in terms of RAM usage, I'm curious how many constraint templates you'd be looking to use?

That's a question I had as well. We have lots of use cases, not only for native Kubernetes objects, but lots of CRDs for different commercial and other products we use. Essentially, we use OPA to implement more secure multitenancy in our clusters where the controllers are lacking (typical examples are preventing cross-namespace references, validating schema where the CRDs are lacking, enforcing certain settings within object specs, deconflicting global settings like ingress hostnames, etc.)

I'm curious if all policies use all library code, or if selective importing would limit the duplication? How many of those 1000 lines are library code?

They almost all use some kind of library code. But I don't know enough yet to say how much duplication we'll need to introduce. For example, a common complexity is objects that have similar shapes. Think all objects that include Pod somewhere in their data -- Job, Deployment, StatefulSet, CronJob. We use some products that have similar characteristics where we need multiple rules to select the different varieties -- object A has the data at spec.x.y.z, object B has it at spec.a[].b[].c, etc. -- in some cases the nesting goes very deep and there are many different varieties. Some of these products also nest things like Pod or other native objects we want to govern.

what benefits are you looking for in switching to constraints/templates?

To be honest, mainly we would like to stop paying for Styra. We really only use it for Kubernetes policy and not any of its other features

jcmcken avatar Apr 07 '23 17:04 jcmcken

Think all objects that include Pod somewhere in their data -- Job, Deployment, StatefulSet, CronJob. We use some products that have similar characteristics where we need multiple rules to select the different varieties -- object A has the data at spec.x.y.z, object B has it at spec.a[].b[].c, etc

For the general model of "resources that wrap other resources", especially if those resources are created in the cluster, expansion templates may be helpful, letting you only worry about writing policies against pods. They also play a bit nicer with match criteria, which may improve the enforcement accuracy.

Definitely curious how things go. Keep us updated!

maxsmythe avatar Apr 13 '23 22:04 maxsmythe

@maxsmythe Thanks, I'll look into that. Not sure we're at a Gatekeeper version that has that alpha feature, but I suppose we can upgrade to try it out

jcmcken avatar Apr 17 '23 15:04 jcmcken

This issue/PR has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Jun 16 '23 21:06 stale[bot]