opa icon indicating copy to clipboard operation
opa copied to clipboard

union() is slower than pure Rego, can be up to O(n^3)

Open charlesdaniels opened this issue 2 years ago • 1 comments

union() is very slow in Rego because of how it is implemented currently.

Consider the implementation of the union builtin:

// builtinSetUnion returns the union of the given input sets
func builtinSetUnion(a ast.Value) (ast.Value, error) {

	inputSet, err := builtins.SetOperand(a, 1)
	if err != nil {
		return nil, err
	}

	result := ast.NewSet()

	err = inputSet.Iter(func(x *ast.Term) error {
		n, err := builtins.SetOperand(x.Value, 1)
		if err != nil {
			return err
		}
		result = result.Union(n)
		return nil
	})
	return result, err
}

And also ast.set.Union():

// Union returns the set containing all elements of s and other.
func (s *set) Union(other Set) Set {
	r := NewSet()
	s.Foreach(func(x *Term) {
		r.Add(x)
	})
	other.Foreach(func(x *Term) {
		r.Add(x)
	})
	return r
}

If we have $n$ many sets to union, of size $k$ each, then that's going to result in $n$ many calls to ast.set.Union(), except that each call to ast.set.Union() is going to end up creating a whole new set and copying all of the existing elements over to it.

This means that each call to ast.set.Union() takes increasingly more time, specifically $O(k), O(2k), O(3k), \dots$. This means the running time is O(k * nth triangle number) = $O\left( \frac{(n+1)!}{2!(n-2)!} \cdot k\right) = O\left(\frac{1}{2} k n (n^2-1)\right) \approx O(kn^3)$.

Empirical Benchmarks

With the union() builtin:

$ cat slowunion/slowunion.rego
package slowunion

SIZE := 250
NSETS := 250

nums := numbers.range(0, NSETS)
sizes := numbers.range(0, SIZE)

sets[n] = x {
	nums[n]
	x := {sprintf("%d,%d", [n, i]) | sizes[i]}
}

combined := union({s | s := sets[_]})
$ time opa eval --profile --format pretty --bundle ./slowunion/ --import "data.slowunion" 'count(data.slowunion.combined)'
63001
+------------------------------+------------+
|            METRIC            |   VALUE    |
+------------------------------+------------+
| timer_rego_load_bundles_ns   | 583584     |
| timer_rego_module_compile_ns | 730417     |
| timer_rego_module_parse_ns   | 162458     |
| timer_rego_query_compile_ns  | 52125      |
| timer_rego_query_eval_ns     | 2673904333 |
| timer_rego_query_parse_ns    | 29625      |
+------------------------------+------------+
+--------------+----------+----------+-----------------------------+
|     TIME     | NUM EVAL | NUM REDO |          LOCATION           |
+--------------+----------+----------+-----------------------------+
| 2.543266402s | 4        | 254      | slowunion/slowunion.rego:14 |
| 115.799818ms | 63503    | 126253   | slowunion/slowunion.rego:11 |
| 3.801167ms   | 2        | 2        | data.slowunion.combined     |
| 152.332µs    | 3        | 3        | slowunion/slowunion.rego:6  |
| 119.959µs    | 1        | 251      | slowunion/slowunion.rego:10 |
| 95.999µs     | 3        | 3        | slowunion/slowunion.rego:7  |
| 4.624µs      | 1        | 1        | slowunion/slowunion.rego:4  |
| 2.125µs      | 1        | 1        | slowunion/slowunion.rego:3  |
+--------------+----------+----------+-----------------------------+

real	0m2.699s
user	0m3.573s
sys	0m0.137s

With pure Rego instead:

$ cat slowunion/slowunion.rego
package slowunion

SIZE := 250
NSETS := 250

nums := numbers.range(0, NSETS)
sizes := numbers.range(0, SIZE)

sets[n] = x {
	nums[n]
	x := {sprintf("%d,%d", [n, i]) | sizes[i]}
}

#combined := union({s | s := sets[_]})
combined := {t | s := sets[_]; s[t]}
$ time opa eval --profile --format pretty --bundle ./slowunion/ --import "data.slowunion" 'count(data.slowunion.combined)'
63001
+------------------------------+-----------+
|            METRIC            |   VALUE   |
+------------------------------+-----------+
| timer_rego_load_bundles_ns   | 538083    |
| timer_rego_module_compile_ns | 725375    |
| timer_rego_module_parse_ns   | 176458    |
| timer_rego_query_compile_ns  | 49208     |
| timer_rego_query_eval_ns     | 327751125 |
| timer_rego_query_parse_ns    | 31375     |
+------------------------------+-----------+
+--------------+----------+----------+-----------------------------+
|     TIME     | NUM EVAL | NUM REDO |          LOCATION           |
+--------------+----------+----------+-----------------------------+
| 195.567117ms | 254      | 63254    | slowunion/slowunion.rego:15 |
| 114.836799ms | 63503    | 126253   | slowunion/slowunion.rego:11 |
| 2.148999ms   | 2        | 2        | data.slowunion.combined     |
| 160.375µs    | 3        | 3        | slowunion/slowunion.rego:6  |
| 117.471µs    | 1        | 251      | slowunion/slowunion.rego:10 |
| 110.126µs    | 3        | 3        | slowunion/slowunion.rego:7  |
| 10.416µs     | 1        | 1        | slowunion/slowunion.rego:4  |
| 2.084µs      | 1        | 1        | slowunion/slowunion.rego:3  |
+--------------+----------+----------+-----------------------------+

real	0m0.354s
user	0m0.426s
sys	0m0.029s

System info:

$ uname -a
Darwin styra-macbook.local 21.5.0 Darwin Kernel Version 21.5.0: Tue Apr 26 21:08:37 PDT 2022; root:xnu-8020.121.3~4/RELEASE_ARM64_T6000 arm64
$ system_profiler SPSoftwareDataType SPHardwareDataType

Software:

    System Software Overview:

      System Version: macOS 12.4 (21F79)
      Kernel Version: Darwin 21.5.0
      Boot Volume: Macintosh HD
      Boot Mode: Normal
      Computer Name: styra-macbook
      User Name: Charles Daniels (cad)
      Secure Virtual Memory: Enabled
      System Integrity Protection: Enabled
      Time since boot: 5 days 15:17

Hardware:

    Hardware Overview:

      Model Name: MacBook Pro
      Model Identifier: MacBookPro18,3
      Chip: Apple M1 Pro
      Total Number of Cores: 8 (6 performance and 2 efficiency)
      Memory: 32 GB
      System Firmware Version: 7459.121.3
      OS Loader Version: 7459.121.3
      Serial Number (system): REDACTED
      Hardware UUID: REDACTED
      Provisioning UDID: REDACTED
      Activation Lock Status: Disabled

$ opa version
Version: 0.43.0
Build Commit: d75bbdd
Build Timestamp: 2022-07-29T21:15:42Z
Build Hostname: Mac-1659129125033.local
Go Version: go1.18.4
Platform: darwin/arm64
WebAssembly: unavailable

charlesdaniels avatar Aug 08 '22 15:08 charlesdaniels

@charlesdaniels Thank you for reporting this, and for the detailed analysis/example code. :smile:

I think for now the easiest way to boost performance for this builtin would be to inline a chunk of the set unioning logic into the builtin's definition. That could get rid of a whole level of wasted allocations, since we'd not be throwing away temporary sets along the way.

philipaconrad avatar Aug 08 '22 16:08 philipaconrad