opa
opa copied to clipboard
Support Caching Output of Non-Deterministic Builtins
This PR implements support for caching the results of non-deterministic builtins for more complete Decision Logging support. Currently, only the rego
module and its direct dependents have received the plumbing work this feature requires.
The NDBuiltinCache is just a builtins.Cache
, which is carefully propagated downwards into the topdown.BuiltinContext
from the query/eval context. This allows non-deterministic builtins to trivially add cache entries during evaluation, which can later be recovered by higher-level library users.
Rough Usage Example (derived from the tests):
// Set up the NDBuiltinCache, and insert some arbitrary data.
ndBC := builtins.Cache{}
ndBC.Put(5, 3)
// Run a Rego query.
ctx := context.Background()
_, err := rego.New(rego.Query(query), NDBuiltinCache(ndBC)).Eval(ctx)
if err != nil {
t.Fatal(err)
}
// Get the arbitrary data back out later.
if v, ok := ndBC.Get(5); ok {
fmt.Println("k:", 5, "v:", v)
}
// ndBC will have additional cache entries from any non-deterministic builtins
// that ran during the query's evaluation.
Task List
- [x] Extend evaluator to support non-deterministic builtin result caching
- [x] Cache insertion logic for non-deterministic builtins
- :heavy_check_mark:
rand.intn
(relies on RNG) - :x: ~
io.jwt.encode_sign
,io.jwt.encode_sign_raw
(relies on RNG internally)~ (Dropped for security/scope reasons for now) - :x: ~
io.jwt.decode_verify
(relies on time internally)~ (Dropped for security/scope reasons for now) - :heavy_check_mark:
time.now_ns
(relies on time directly) - :heavy_check_mark:
http.send
(unpredictable HTTP request results) - :heavy_check_mark:
net.lookup_ip_addr
(unpredictable DNS resolution results) - :heavy_check_mark:
opa.runtime
(unpredictable config/environment-dependent results) - :heavy_check_mark:
uuid.rfc4122
(relies on RNG)
- :heavy_check_mark:
- [x] Test cases
Fixes #1514
After a discussion with @tsandall, I'm going to start over by trying to reuse the builtin cache, instead of doing a fully-custom, new caching system. I've backed up my previous changes for this feature on another fork-local branch, so the history is still there if I end up needing any of it.
There are a surprising number of panics happening in the test suite, mostly due to the NDBuiltinsCache not be initialized by default in several spots (like the WASM tests).
I think once these obvious faults are patched this PR will be RFR.
Looks like the obvious sources of Golang panic
s have been cleared up. I'll start patching the other builtins to use the NDBuiltinCache.
Looks like the WASM SDK's e2e tests are causing issues. I'll investigate. It's probably because the WASM versions of the ND builtins aren't adding themselves to the cache. :slightly_frowning_face:
Wrt JWTs — there are no security concerns around storing the result of JWT verification. Only signing would require the private key to be sent along to the logs, but those built-ins are mainly used in test code from what I've seen. I would however think the various JWT verification functions, toghether with http.send
, would be those most commonly used in real world policies.. so it would be really good if those could be included.
I decided to add some guard logic to the ND builtins, which take advantage of the fact that builtins.Cache
is a pointer to a map, and in Go, pointers can be checked to see if they're nil
. The nil checks should be pretty cheap, and allow the NDBuiltins Caching system to be an opt-in, as opposed to an opt-out system. This also means that an uninitialized NDBuiltinCache
will not result in a panic.
Currently, the WASM suite doesn't see or react to the NDBuiltinsCache feature at all. (Because the tests are "internal/golang" tests, these do not get reflected over for the WASM suite to use.) In the future, if there is demand for it, I imagine it would need to be ported over to the WASM runtime/builtins set. :thinking:
The flaky server tests are such a pain. :upside_down_face: I'll kick the GH Actions run again once this round finishes.
ND builtins now are cached entirely through logic in the evaluator. :smile: Builtins can now mark themselves as non-deterministic via a new per-builtin property (Nondeterministic
), and that property triggers the appropriate NDBuiltinCache
check in the evaluator.
The NDBuiltinCache
is also still opt-in, and should go unused by default. This will have a minimal penalty on builtin calls (one extra branch), but good branch prediction should elide the cost most of the time.
I think the commit stack has grown a bit massive, so to reduce the thread clutter from future rebases, I'm going to do a squash rebase.
Once the Github Codespaces outage resolves, we'll need to kick the CI jobs for this PR again. They should come up green, based on local checks, but you never know. :man_shrugging:
After seeing all of the CI checks go green, I've preemptively squashed my commit chain down to make it easier to merge later.
Good to merge with the extra case addressed, I'd say. Unless I'm missing something and it's not needed, but then please fill me in 😃