opa Dependency Management

I'd like to kick off some discussion around adding explicit dependencies to Rego.

The primary use case I have in mind is sharing code. At @Fugue we have a bunch of places in our product where either internal or external users can edit and add Rego code. The interface is usually just a text editor, so we're limited to single files. This works really well, but it becomes harder when you want to share code between policies. I have a strong suspicion we're not the only ones in this position.

So what can we do to solve this?

One idea is of course building a good editor where users can edit a directory tree or multiple files. This is complicated and leads to other issues, where parts of the library of Rego code can be broken at any point, and so on.

Another alternative is allowing people to upload bundles, but this makes it less accessible for less technical users, since bundles need to be created using the CLI.

Since both of these options have significant downsides, I think there's a better way: allowing OPA to retrieve policies from external sources. Here's a strawman proposal.

We'll add a new require keyword that introduces explicit dependencies in Rego. require tells the agent that a policy requires other Rego files or bundles to be loaded.

I can imagine this could support a number of different protocols:

require "/absolute/path.rego"
require "relative/path.rego"
require "https://github.com/user/repo/v0.1.0/library.rego"
require "ipfs://..."
require "git://..."
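To make the list above a bit more concrete, here is a rough sketch of how a loader might tell these forms apart. It is written in Python purely for illustration and is not part of OPA; the function name classify_require and the set of accepted schemes are assumptions taken from the examples above.

```python
from urllib.parse import urlparse

# Assumption: the schemes listed in the proposal are the only remote ones accepted.
REMOTE_SCHEMES = {"https", "ipfs", "git"}


def classify_require(target: str) -> str:
    """Classify a require target as a local path kind or a remote scheme."""
    scheme = urlparse(target).scheme
    if scheme in REMOTE_SCHEMES:
        return scheme
    if scheme:
        # Anything else with a scheme is rejected rather than guessed at.
        raise ValueError(f"unsupported scheme: {scheme!r}")
    # No scheme at all: treat the target as a filesystem path.
    return "absolute-path" if target.startswith("/") else "relative-path"


if __name__ == "__main__":
    for target in ("/absolute/path.rego", "relative/path.rego", "git://..."):
        print(target, "->", classify_require(target))
```

Rejecting unknown schemes up front keeps the set of ways code can enter a policy explicit, which matters once remote sources are involved.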

HTTPS is probably the most useful one and would make for a good MVP. A nice addition would be to optionally specify a cryptographic hash of the dependency so we have certainty about the code we're running.
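A minimal sketch of what the hash check could look like, assuming SHA-256 and a hex-encoded digest (the proposal does not name an algorithm or encoding). The function name verify_dependency is hypothetical, and Python is used only for illustration.

```python
import hashlib
import hmac


def verify_dependency(content: bytes, expected_sha256: str) -> bytes:
    """Return content unchanged if its SHA-256 matches the pinned hex digest."""
    actual = hashlib.sha256(content).hexdigest()
    # compare_digest avoids leaking how many leading characters matched.
    if not hmac.compare_digest(actual, expected_sha256.lower()):
        raise ValueError(
            f"digest mismatch: expected {expected_sha256}, got {actual}"
        )
    return content


if __name__ == "__main__":
    source = b"package lib\n\np := 7\n"
    pinned = hashlib.sha256(source).hexdigest()
    verify_dependency(source, pinned)  # passes silently
    try:
        verify_dependency(source + b"q := 8\n", pinned)
    except ValueError as err:
        print("rejected:", err)
```

The check runs on the fetched bytes before anything is parsed, so a tampered dependency never reaches the compiler.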

I first tried to make this part of import. However, surprisingly, it is completely orthogonal to the import keyword. I think of import more as aliasing long data.foo.bar.qux paths to just qux. You can require a bundle and import different subpackages. Or you can require some files and not import them at all. In either case, I think it's nice that we don't need to further complicate import.

Of course, OPA needs to have an option to either turn this off fully or make the cryptographic hash a requirement.
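One way to picture that switch is a three-way setting checked before any fetch happens. This is only a sketch: the mode names and the function check_remote_dependency are made up for illustration and do not correspond to any existing OPA flag.

```python
from enum import Enum
from typing import Optional


class RemoteDependencyMode(Enum):
    # Hypothetical settings, not real OPA configuration values.
    DISABLED = "disabled"          # never fetch remote dependencies
    REQUIRE_HASH = "require-hash"  # fetch only when a hash is pinned
    ALLOW = "allow"                # fetch with or without a hash


def check_remote_dependency(
    mode: RemoteDependencyMode, url: str, pinned_sha256: Optional[str]
) -> None:
    """Raise PermissionError if the current mode forbids fetching url."""
    if mode is RemoteDependencyMode.DISABLED:
        raise PermissionError(f"remote dependencies are disabled: {url}")
    if mode is RemoteDependencyMode.REQUIRE_HASH and not pinned_sha256:
        raise PermissionError(f"{url} must be pinned to a hash")
```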

Apr 12 '21 19:04 jaspervdj-luminal

I took a stab at this last week and the results are quite promising. Here's an example w/ opa build:

A few comments:

The implementation uses the github.com/open-policy-agent/opa/refactor package to namespace dependencies. Dependencies are namespaced under the current package. For example:

package x

import "https://openpolicyagent.org/x/lib.rego" as foo

p := foo.lib.p

Assuming lib.rego is defined as:

package lib

p := 7

The namespace for lib would be data.x.foo.lib.
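The naming rule in this example can be written down as a tiny helper. This only illustrates how the resulting path is composed; it is not how the refactor package works internally, and the function name dependency_namespace is hypothetical.

```python
def dependency_namespace(importing_package: str, alias: str, dependency_package: str) -> str:
    """Compose the data path a dependency ends up under after namespacing."""
    return ".".join(["data", importing_package, alias, dependency_package])


if __name__ == "__main__":
    # package x imports lib.rego (package lib) as foo
    print(dependency_namespace("x", "foo", "lib"))  # data.x.foo.lib
```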

The implementation uses import instead of a new keyword. I went with import because it felt like most of the ast package changes to introduce require were duplicating what we do for import. The only thing I don't like about using import is that the effect for URLs is different from the effect for paths (e.g., when you import "foo" as x, the statement defines a virtual document at <package>.x, unlike when you import data.foo as x, which only creates an alias inside the current file). This inconsistency could cause confusion for new users. However, at the same time, I could imagine them being confused by two separate keywords that at first glance mean the same thing.
This is just a prototype. There are a bunch of features that could be added: caching (dependencies are currently fetched on every compile), deduplication (if the same dependency is imported in multiple places, it will be duplicated currently), hash pinning, authentication (no authentication is supported currently), parallel fetching (dependencies are fetched in serial currently), additional protocols (http only right now), etc.
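For the caching and deduplication items, a sketch of the simplest possible approach: an in-memory cache keyed by URL, so each dependency is fetched once per process no matter how many files import it. This is Python for illustration only and is not taken from the prototype; DependencyCache and the injected fetch callable are assumptions.

```python
from typing import Callable, Dict


class DependencyCache:
    """Fetch each URL at most once and reuse the bytes afterwards."""

    def __init__(self, fetch: Callable[[str], bytes]) -> None:
        # fetch is injected so the cache can be exercised without a network.
        self._fetch = fetch
        self._cache: Dict[str, bytes] = {}

    def get(self, url: str) -> bytes:
        if url not in self._cache:
            self._cache[url] = self._fetch(url)
        return self._cache[url]


if __name__ == "__main__":
    fetched = []

    def fake_fetch(url: str) -> bytes:
        fetched.append(url)
        return b"package lib\n\np := 7\n"

    cache = DependencyCache(fake_fetch)
    cache.get("https://openpolicyagent.org/x/lib.rego")
    cache.get("https://openpolicyagent.org/x/lib.rego")
    print(len(fetched))  # 1
```

A cache like this does nothing about persistence across compiles or about invalidation; pinning by hash would make both of those much easier to reason about.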

Jul 06 '21 16:07 tsandall

I think implementing something like this: https://deno.land/[email protected]/linking_to_external_code/integrity_checking should remediate many concerns brought forward in the GK discussion: availability is done (vendored copy), security too (integrity checks, committed vendor dir), latency/networking.

Aug 20 '21 11:08 srenatus

Adding an external dependency directly into the policy file via import statements has its shortcomings. Taking inspiration from package managers for other languages, having a dedicated file where the dependencies are defined and, more importantly, pinned to a particular version/digest goes a long way in making the process manageable in the long run.
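As a sketch of what such a dedicated file could contain, here is a made-up JSON manifest and a loader that refuses entries without a digest. The file layout, the field names (dependencies, source, sha256), and the all-zero digest are placeholders for illustration, not an existing format.

```python
import json
from dataclasses import dataclass
from typing import List

# Hypothetical manifest; the digest below is a placeholder, not a real hash.
EXAMPLE_MANIFEST = """
{
  "dependencies": {
    "lib": {
      "source": "https://openpolicyagent.org/x/lib.rego",
      "sha256": "0000000000000000000000000000000000000000000000000000000000000000"
    }
  }
}
"""


@dataclass(frozen=True)
class Dependency:
    name: str
    source: str
    sha256: str


def load_manifest(text: str) -> List[Dependency]:
    """Parse the manifest and insist that every dependency is pinned."""
    doc = json.loads(text)
    deps = []
    for name, entry in sorted(doc["dependencies"].items()):
        if not entry.get("sha256"):
            raise ValueError(f"dependency {name!r} is not pinned to a digest")
        deps.append(Dependency(name, entry["source"], entry["sha256"]))
    return deps


if __name__ == "__main__":
    for dep in load_manifest(EXAMPLE_MANIFEST):
        print(dep.name, dep.source)
```

Keeping the pins in one file means policy sources stay free of URLs and digests, and a version bump becomes a one-line change that is easy to review.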

Nov 23 '22 18:11 lcarva

ODM is a really cool side project by @johanfylling exploring this space. Early stage still, but definitely looks promising.

Jun 13 '23 19:06 anderseknert

Ancient bump!

I was talking to @johanfylling about this offline. Putting notes here.

ODM seems like a good start and something we're starting to play with. That said, for developer experience, it would be nice to see:

A central registry, something like npm, pkg.go.dev, etc., that is viewable from a browser, that collaborators can contribute to, and that can be used in Rego files, specifically with the import reserved word
Lockfiles with versioning, either semver or tied to a SHA to start
CLI tool (e.g., ODM) to manage versions and dependencies
Option to vendor dependencies (or not)

There are existing projects like Open Policy Containers [0], which approaches this with OCI-compatible bundles and so gets all the versioning goodness. It's already integrated with OPA, but at the agent layer, I believe. This means I can't pull down external/public policies and evaluate them directly in memory, in Go, for example; I would need to deploy a sidecar.

Ultimately, it would be nice to create the ability to view, share, and collaborate on policies. There are compliance standards and foundational infra practices (RBAC, ABAC, etc.) which could benefit from reusable policies. HashiCorp with Sentinel [1] sort of has an attempt here, and there's prior art in the osquery space as well [2].

[0] https://openpolicycontainers.com/
[1] https://registry.terraform.io/browse/policies
[2] https://fleetdm.com/queries

Sep 15 '23 01:09 jeffchao

Ultimately, it would be nice to create the ability to view, share, and collaborate on policies.

Agreed, although I think GitHub (or GitLab, or whatever) is the better place for that than some central registry. I like the way pkg.go.dev does this (as you mentioned) where they don't really manage a registry per se, but mirror content from GitHub / Git, and present it along with docs (which for Rego could be rendered from metadata annotations) and other metadata.

Sep 19 '23 08:09 anderseknert
