expr icon indicating copy to clipboard operation
expr copied to clipboard

Sandboxing expression

Open alexec opened this issue 4 years ago • 17 comments

We want to be able to sandbox expression evaluation, because some users might submit time, memory, or CPU intensive expression, either accidentally, or as a intentional attack of code that uses expr, example

map(0..10000, {sprig.genPrivateKey('rsa')}

alexec avatar Oct 21 '21 20:10 alexec

There is a special "memory budget" implemented for this. See https://github.com/antonmedv/expr/blob/master/vm/vm.go#L15

antonmedv avatar Oct 21 '21 21:10 antonmedv

Will that contain CPU usage too? Will it time-bound evaluation?

alexec avatar Oct 21 '21 21:10 alexec

It's more like an expression cost estimator.

antonmedv avatar Oct 21 '21 21:10 antonmedv

But perhaps timeout of 1s would prevent much memory usage.

alexec avatar Oct 21 '21 21:10 alexec

Memory budget is better as expr doesn’t support loops.

antonmedv avatar Oct 22 '21 06:10 antonmedv

Surely map is loop?

alexec avatar Oct 22 '21 15:10 alexec

But only can map on provided arrays or 1..2 range expressions. Both are constrained.

antonmedv avatar Oct 22 '21 16:10 antonmedv

So you think time-bounding and CPU bounding are not needed?

alexec avatar Oct 22 '21 16:10 alexec

It will be cool to have. But right now expr has only query cost calculations.

antonmedv avatar Oct 22 '21 16:10 antonmedv

Can't you nest maps? If so, I feel like the loop is effectively unbounded.

map(0..10000, {map(0..10000, {map(0..10000, {sprig.genPrivateKey('rsa')}}}

crenshaw-dev avatar Oct 22 '21 16:10 crenshaw-dev

You can and expr accounts for this as well. Try in playground.

antonmedv avatar Oct 22 '21 19:10 antonmedv

Ah. So the map limit is cumulative. I can see that a million triggers the limit:

> map(0..999, {map(0..999, {"a"})})
panic: memory budget exceeded (1:14)
> map(0..998, {map(0..999, {"a"})})
[[a a a a a a a a a a a a a a a a...

But a million seems like enough to cause trouble. For example,

map(0..998, {map(0..999, {sprig.genPrivateKey('rsa')})})

If I understand correctly, this will cause 999k RSA keys to be generated and stored in memory, unless some external mechanism stops it.

Is that correct, or is there another safety mechanism I'm missing?

crenshaw-dev avatar Oct 22 '21 20:10 crenshaw-dev

I think this mechanism can be improved. We need to explain to expr what call to sprig.genPrivateKey costs, for example, 100k. So only 10 calls can be made and there’s plenty room for every thing else.

antonmedv avatar Oct 22 '21 20:10 antonmedv

Can you think about API for this and create the PR?

antonmedv avatar Oct 22 '21 20:10 antonmedv

I'm worried that memory cost estimation might be prohibitively difficult because cost might be heavily contingent on parameters.

As a simple example, sprig.genPrivateKey will generate different-length keys for different key types. Without reading every sprig function, I don't think we could describe the memory cost of the functions given certain parameters. Even if we could, I'm not sure how we would succinctly communicate to expr "this function when given these parameters will have memory cost X." (For example, how would we express, "this function, when given a triply-nested int array, will cost 1byte * avg. 3rd-level array length * avg. 2nd-level array length...")

For our use case, I think a simple timeout might be more effective. I'm assuming that, for example, a 10s timeout is short enough to prevent an OOM crash loop or a damaging CPU usage spike.

crenshaw-dev avatar Oct 22 '21 20:10 crenshaw-dev

Yes, timeout like this can also be implemented.

antonmedv avatar Oct 22 '21 20:10 antonmedv

@crenshaw-dev I think we should probably exclude expensive function from spig. by default. Or more explicitly, we only opt in. I'll create a ticket.

alexec avatar Oct 23 '21 16:10 alexec

See https://github.com/antonmedv/expr/issues/260

antonmedv avatar Oct 16 '22 20:10 antonmedv