opa icon indicating copy to clipboard operation
opa copied to clipboard

Add optimized functions for matching an array of strings with an array of prefixes/suffixes.

Open cube2222 opened this issue 1 year ago • 0 comments

What is the underlying problem you're trying to solve?

Hey! At Spacelift we integrate with OPA extensively. One of the use cases is reacting to Push events and deciding whether it's interesting for the current Stack or not.

A pattern we've been seeing our users writing very commonly, is the following:

track {
    startswith(input.push.affected_files[_], interesting_paths[_])
}

which, if both affected_files and interesting_paths are dynamic arrays, ends up being O(n^3), and is a bottleneck for many policies.

Describe the ideal solution

Add a function any_prefix_matches_any_string (name TBD) which accepts two arguments which are both arrays of strings. Then checks if any of the strings in the second argument is a prefix of any of the strings in the first argument, but does it fast.

Similarly, a any_suffix_matches_any_string (name TBD) function for suffixes.

For a test case of ours which with the naive approach takes 60ms, a draft implementation based on Patricia trees takes 0.5-1ms.

Additional Context

Very happy to contribute this.

cube2222 avatar Aug 10 '22 09:08 cube2222

@cube2222 does you implementation add couple of builtins that perform the above operations? Feel free to open a draft PR with your changes.

ashutosh-narkar avatar Aug 10 '22 15:08 ashutosh-narkar

Sure! Here's a draft implementation @ashutosh-narkar: https://github.com/open-policy-agent/opa/pull/4997

cube2222 avatar Aug 10 '22 16:08 cube2222