# Add execution quotas
## Idea
Similar to how Luerl's sandbox has a way of specifying max reductions, it'd be neat to be able to limit either the max reductions (in BEAM-land) or the max Lua bytecode instructions.
This would be helpful when sandboxing user code that might be poorly behaved, or that we want to timeshare more carefully.
## Suggestions
- Add optional `max_reductions` or `max_instructions` keys to `Lua.eval!`'s `options` param.
- Change the return values in such cases: `{:done, ...returned values..., ...returned state...}` or `{:exceeded_reductions, ...returned state...}` or `{:exceeded_instructions, ...returned state...}`
## Prior art
- https://hexdocs.pm/luerl/luerl_sandbox.html
@crertel I really like this idea; however, I should point out that this is not currently possible to implement in this library, because Luerl has no yielding mechanism like the one you're suggesting. Let's walk through the options.
### The process-based sandbox
The way that `luerl_sandbox` works is that it starts a `gen_server`, then runs your script inside a second process. The `gen_server` polls that process on a timer, measuring its reductions. If the reductions exceed the limit, it kills the process and returns an error; otherwise it returns the script's result. In other words, the script either runs to completion and returns, or is preempted with no possibility of continuing.
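The poll-and-kill pattern itself is language-agnostic. Here is a rough sketch in Python standing in for the Erlang (every name here is made up for illustration, not the `luerl_sandbox` API): a supervisor runs the work in a separate OS process, polls it on a timer, and kills it once a budget is exceeded.

```python
import multiprocessing as mp
import time

def _worker(queue, fn):
    # Child process: run the untrusted work and report its result.
    queue.put(fn())

def run_with_budget(fn, budget_s=1.0, poll_interval_s=0.02):
    """Poll-and-kill sandbox in the spirit of luerl_sandbox's gen_server.

    Wall-clock time stands in for BEAM reductions here. The work either
    runs to completion and returns, or is killed with no way to resume.
    """
    queue = mp.Queue()
    proc = mp.Process(target=_worker, args=(queue, fn))
    proc.start()
    deadline = time.monotonic() + budget_s
    while proc.is_alive():
        if time.monotonic() > deadline:
            proc.terminate()  # preempted: no possibility of continuing
            proc.join()
            return ("exceeded_budget", None)
        time.sleep(poll_interval_s)
    proc.join()
    return ("done", queue.get())

def quick():
    return 42

def spin_forever():
    while True:
        pass
```

On the BEAM the poll would read `Process.info(pid, :reductions)` rather than a clock, but the shape is the same: the supervisor can only observe and kill, never pause and resume.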
I do not like the implementation of this sandbox, as it fundamentally changes how you interact with Luerl, moving execution from the current process to some other process.
However, if we wanted to simply do the `max_reductions`, we could build a process-based sandbox like Luerl has.
### The instruction-based sandbox
In this case, Luerl could instrument the generated instructions with tracer instructions. These would make some function call through which arbitrary tracing operations could occur, passing along accumulated values such as reduction count, instruction count, execution time, etc.
The tracer function could decide to either `:yield` or `:continue`, and that decision could then be bubbled up.
The benefit of this approach is that it could lead to more interesting ways of sandboxing and open up new capabilities. The downside is that it must be implemented in Luerl and is orders of magnitude more complicated.
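To make the instrumentation idea concrete, here is a toy sketch (Python standing in for Luerl's Erlang; every name is hypothetical): an instruction loop that consults a tracer with its accumulated counters before each instruction, and bubbles a yield up to the caller along with enough state to resume.

```python
def run(instructions, state=0, tracer=None, start_at=0):
    """Toy instruction loop with a tracer hook.

    `instructions` is a list of functions state -> state. Before each
    one, the tracer sees the accumulated counters and answers
    "continue" or "yield". Returns ("done", state) when finished, or
    ("yielded", state, resume_index) so the caller can resume later.
    """
    counters = {"instructions": 0}
    for i in range(start_at, len(instructions)):
        if tracer is not None and tracer(counters) == "yield":
            return ("yielded", state, i)  # bubble the pause up
        counters["instructions"] += 1
        state = instructions[i](state)
    return ("done", state)

def max_instructions(limit):
    # A tracer that yields once `limit` instructions have executed.
    def tracer(counters):
        return "yield" if counters["instructions"] >= limit else "continue"
    return tracer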
WDYT?
> However, if we wanted to simply do the `max_reductions`, we could build a process-based sandbox like Luerl has.
So, this is a really good point.
> The benefit of this approach is that it could lead to more interesting ways of sandboxing and open up new capabilities. The downside is that it must be implemented in Luerl and is orders of magnitude more complicated.
Also very good points.
I think maybe there are two different things we're kinda feeling out:
- Sometimes, we want to say "look friend just run the code and kill it if it would take too long".
- Sometimes, we want to say "okay just keep executing this much, and then we're gonna force you to pre-emptively multitask from the execution environment. you can be as gross as you want, but we'll keep you from hogging the system."
The first one is "easy", the second is more useful but harder.
Maybe an incremental approach? Like, just the sandbox itself would be kinda handy, and then we buy Robert sufficient beer to have a stab at the other one?
The semantics also get kinda weird if, for example, the user runs Lua that calls back into Elixir or whatever and that has side effects: the sandbox dies and doesn't really "return" anything, but may have touched stuff outside its lifecycle. I'd argue that's just something to warn users about.
Once a function calls into Elixir or Erlang, all bets are off on pausing execution. I'm not aware of a BEAM mechanism for pausing and resuming a process from where it left off.
The process-based sandbox is really easy; as you mentioned, the more sophisticated version that yields back will require a lot more work in Luerl itself.
I am happy to work on those changes to Luerl, but I will need to get buy-in from Robert first.
That's awesome! And yeah, good call on callbacks being unreliable.
This is probably off-base, but I'd be curious to know whether it's possible to disable certain language-level constructs, like loops in this instance, to achieve the process-based sandboxing approach mentioned above.
@KelvinHu My guess there would be to have something that can traverse the Lua AST and then patch in/out/whatever as needed prior to execution?
> @KelvinHu My guess there would be to have something that can traverse the Lua AST and then patch in/out/whatever as needed prior to execution?
My thought process was to error out at compile time if the lexer detects a loop. This would, however, bastardize Lua into a restricted subset, but it would avoid the need for potentially complex AST manipulation.
You could do that, but it's a bad strategy IMO. What about recursive functions?
You got me there, I didn't think that far ahead.
FWIW, if you need to aggressively sandbox, I think that time- and resource-based quotas are still the way to go.
I haven't had the bandwidth to implement this idea further but will look into it.
PRs welcome
I have not seen this discussion before. Using the tracing works for function calls, returns, and also for when the current line changes. This means that it could be used to keep track of the amount of work done, "reductions", but there is no built-in way of stopping the system in a clean way. This could be added. Trace calls could probably also be added for loops, which would guarantee full control of limiting execution.
Just a thought.
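For what it's worth, Python happens to have an analogous hook in `sys.settrace` (it fires on calls, returns, and line changes), which makes a runnable illustration of the counting idea described above, including the lack of a clean built-in stop: here the stop has to be forced by raising from the tracer. This is only an analogy, not Luerl's API; all names are made up.

```python
import sys

class QuotaExceeded(Exception):
    pass

def run_with_quota(fn, max_events):
    """Count trace events (calls, lines, returns) as "reductions" and
    abort once the quota is exceeded.

    There is no clean way to stop the traced code, so the tracer
    raises; the exception propagates out of `fn` and is caught here.
    """
    count = 0

    def tracer(frame, event, arg):
        nonlocal count
        count += 1
        if count > max_events:
            raise QuotaExceeded
        return tracer  # also trace line/return events in this frame

    sys.settrace(tracer)
    try:
        return ("done", fn())
    except QuotaExceeded:
        return ("exceeded_quota", None)
    finally:
        sys.settrace(None)

def well_behaved():
    return 1

def busy():
    total = 0
    for i in range(1_000_000):
        total += i
    return total
```

The missing piece in both Python and Luerl is the same: the tracer can observe and abort, but pausing cleanly so execution can resume is the part that would need new support.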
@rvirding is there any available documentation on how to use the tracing functionality of Luerl?
@rvirding out of curiosity, is the Luerl interpreter running something like Lua bytecode, or interpreting statements?