Modifying large input or data is slow
What is the underlying problem you're trying to solve?
As a workaround for #5801, I have a large input document that I am modifying using with to parameterize incremental rules.
However, performance is very slow. It looks like when changing any values stored in data or input, a deep clone of the entire data or input namespace is performed. If you are evaluating a rule for every item in the input collection, you end up with an n^2 number of copies as you add elements to the list because the entire set of elements is copied for every element the rule is being invoked for.
Describe the ideal solution
Using with to mutate data or input should not require deep clones. The new state should either be a copy-on-write shallow clone of the previous state or the new state should be an overlay over the previous state.
Describe a "Good Enough" solution
As a workaround, because data and input are separate, it's possible to write values into whichever is smaller, but this trick only works once. If your policy requires modifying a value before calling a rule and both data and input already contain a large amount of data you can't just discard, you're out of luck.
If the policy author were able to create arbitrary top-level names, or if data[_] and input[_] were special such that with data.a requires no deep cloning and with data.a.b requires a deep clone of only data.a, that would probably be enough.
Additional Context
# example.rego
package example
import future.keywords
root_input contains "ok" if {
some element in input.collection
item_input with input.element as element
}
item_input contains "ok" if {
some value in input.element
value == 1
}
root_data contains "ok" if {
some element in input.collection
item_data with data.element as element
}
item_data contains "ok" if {
some value in data.element
value == 1
}
package example
import future.keywords
collection := [ { "id": i, "contents": [ j | some j in numbers.range(0, 10) ] } | some i in numbers.range(0, 10000) ]
test_root_input {
# times out
root_input with input.collection as collection
}
test_root_data {
root_data with input.collection as collection
}