Add execution quotas

Open crertel opened this issue 7 months ago • 14 comments

Idea

Similar to how Luerl has a way of specifying max reductions, it'd be neat to have the ability to limit either the max reductions (in BEAM-land) or the max Lua bytecode instructions.

This would be helpful when sandboxing user code that might be poorly-behaved or which we want to more carefully timeshare.

Suggestions

  • Add optional max_reductions or max_instructions keys to Lua.eval!'s options param.
  • Change the return values in such cases: {:done, ...returned values..., ...returned state...} or {:exceeded_reductions, ...returned state...} or {:exceeded_instructions, ...returned state...}

Prior art

  • https://hexdocs.pm/luerl/luerl_sandbox.html

crertel avatar May 14 '25 15:05 crertel

@crertel I really like this idea. However, I should point out that this is not possible to implement in this library, because Luerl implements no yielding mechanism like the one you're suggesting. Let's walk through the options.

The process-based sandbox

The way that luerl_sandbox works is that it starts a gen_server, then kicks off your script inside a second process. The gen_server then polls that process on a timer, measuring its reductions. Once the reductions exceed the limit, it kills the process and returns. If the script finishes within the limit, it returns the result. In other words, the script either runs to completion and returns, or is preempted with no possibility of continuing.

I do not like the implementation of this sandbox, as it fundamentally changes how you interact with Luerl, moving it from the current process to some other process.

However, if we wanted to simply do the max_reductions, we could build a process-based sandbox like Luerl has.
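
To make the "run to completion or get killed" semantics concrete, here is a minimal, language-neutral sketch in Python. It is not Luerl or luerl_sandbox: it swaps in an OS child process for the BEAM process and a wall-clock timeout for the reduction count (Python has no reduction counter), and `run_sandboxed` is a hypothetical name.

```python
import subprocess
import sys


def run_sandboxed(source, timeout_s):
    """Run `source` in a child interpreter; kill it if it outlives the budget.

    Assumption: wall-clock time stands in for a reduction budget.
    """
    try:
        # subprocess.run kills the child itself when the timeout expires.
        proc = subprocess.run(
            [sys.executable, "-c", source],
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # Preempted: no result and no state to resume from.
        return ("exceeded", None)
    return ("done", proc.stdout)


finished = run_sandboxed("print(1 + 1)", 5)
runaway = run_sandboxed("while True: pass", 0.5)
```

Note that the over-budget case has nothing to hand back except the fact that the limit was hit, which is the limitation described above.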

The instruction-based sandbox

In this case, Luerl could instrument the generated instructions with tracer instructions. These instructions could call some function where arbitrary tracing operations could occur, and accumulated values such as reduction count, instruction count, execution time, etc., could be passed to it.

The tracer function could decide to either :yield or :continue, which could then be bubbled up.

The benefit of this approach is that it could lead to more interesting ways of sandboxing and open up new capabilities. The downside is that it must be implemented in Luerl and is orders of magnitude more complicated.
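
As a rough illustration of the tracer idea, here is a toy stack machine in Python. Everything in it (`VM`, `run`, `every`, the three opcodes) is hypothetical and has nothing to do with Luerl's actual instruction set; it only shows a tracer being consulted after each instruction and a `"yield"` decision bubbling up together with resumable state.

```python
from dataclasses import dataclass, field


@dataclass
class VM:
    program: list                                # (opcode, argument) pairs
    pc: int = 0                                  # index of the next instruction
    stack: list = field(default_factory=list)
    executed: int = 0                            # total instructions run so far


def run(vm, tracer):
    """Execute until the program ends or the tracer returns "yield"."""
    while vm.pc < len(vm.program):
        op, arg = vm.program[vm.pc]
        if op == "push":
            vm.stack.append(arg)
            vm.pc += 1
        elif op == "add":
            right, left = vm.stack.pop(), vm.stack.pop()
            vm.stack.append(left + right)
            vm.pc += 1
        elif op == "jump":
            vm.pc = arg
        else:
            raise ValueError(f"unknown opcode: {op}")
        vm.executed += 1
        # The tracer sees the accumulated counters and decides what happens next.
        if tracer(vm) == "yield":
            return ("yield", vm)
    return ("done", vm.stack)


def every(limit):
    """Tracer that asks for a yield after each `limit` instructions."""
    return lambda vm: "yield" if vm.executed % limit == 0 else "continue"


# An endless counter: push 0, then forever (push 1, add, jump back).
looping = VM([("push", 0), ("push", 1), ("add", None), ("jump", 1)])
status, paused = run(looping, every(10))     # preempted after 10 instructions
status2, paused2 = run(paused, every(10))    # resumes where it left off
```

The `("done", ...)` and `("yield", ...)` tuples play the role of the `{:done, ...}` and `{:exceeded_instructions, ...}` return shapes suggested in the issue. The hard part in practice is that the real interpreter state is far richer than a program counter and a stack.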

WDYT?

davydog187 avatar May 14 '25 16:05 davydog187

> However, if we wanted to simply do the max_reductions, we could build a process-based sandbox like Luerl has.

So, this is a really good point.

> The benefit of this approach is that it could lead to more interesting ways of sandboxing and open up new capabilities. The downside is that it must be implemented in Luerl and is orders of magnitude more complicated.

Also very good points.

I think maybe there's two different things we're kinda feeling out:

  • Sometimes, we want to say "look friend just run the code and kill it if it would take too long".
  • Sometimes, we want to say "okay just keep executing this much, and then we're gonna force you to pre-emptively multitask from the execution environment. you can be as gross as you want, but we'll keep you from hogging the system."

The first one is "easy", the second is more useful but harder.

Maybe an incremental approach? Like, just the sandbox itself would be kinda handy, and then we buy Robert sufficient beer to have a stab at the other one?

crertel avatar May 14 '25 16:05 crertel

The semantics also get kinda weird if, for example, the user runs Lua that calls back into Elixir or whatever and that has side-effects--the sandbox dies and doesn't really "return" anything, but maybe touched stuff outside its lifecycle. I'd argue that's just something to warn users about.

crertel avatar May 14 '25 16:05 crertel

Once a function calls into Elixir or Erlang, all bets are off on pausing execution. I'm not aware of a BEAM mechanism for pausing and resuming a process from where it left off.

The process-based sandbox is really easy. As you mentioned, the more sophisticated version that yields back will require a lot more work in Luerl itself.

I am happy to work on those changes to Luerl, but I will need to get buy-in from Robert first.

davydog187 avatar May 19 '25 13:05 davydog187

That's awesome! And yeah, good call on callbacks being unreliable.

crertel avatar May 19 '25 17:05 crertel

This is probably off-base but I'd be curious to know if it's possible to disable certain language level constructs, like loops in this instance, to achieve the process-based sandboxing approach mentioned above.

KelvinHu avatar Jun 12 '25 07:06 KelvinHu

@KelvinHu My guess there would be to have something that can traverse the Lua AST and then patch in/out/whatever as needed prior to execution?

crertel avatar Jun 12 '25 17:06 crertel

> @KelvinHu My guess there would be to have something that can traverse the Lua AST and then patch in/out/whatever as needed prior to execution?

My thought process was to error out on compilation if the lexer detects a loop. This would, however, bastardize Lua into a restricted subset, but it would avoid the need for potentially complex AST manipulation.

KelvinHu avatar Jun 13 '25 04:06 KelvinHu

You could do that, but it's a bad strategy IMO. What about recursive functions?
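
A quick sketch of why, using Python's own `ast` module as a stand-in for a Lua loop check (`contains_loop` is a hypothetical helper, and no Lua tooling is involved): a loop ban rejects the obvious case but lets unbounded recursion straight through.

```python
import ast


def contains_loop(source):
    """Return True if the source contains a for/while loop statement."""
    tree = ast.parse(source)
    loop_nodes = (ast.For, ast.AsyncFor, ast.While)
    return any(isinstance(node, loop_nodes) for node in ast.walk(tree))


with_loop = "while True:\n    pass\n"
with_recursion = "def spin(n):\n    return spin(n + 1)\n"

loop_rejected = contains_loop(with_loop)            # the check catches this
recursion_rejected = contains_loop(with_recursion)  # but not this
```

The second snippet contains no loop construct at all, yet it never terminates on its own.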

davydog187 avatar Jun 13 '25 19:06 davydog187

You got me there, I didn't think that far ahead.

KelvinHu avatar Jun 13 '25 23:06 KelvinHu

FWIW if you need to aggressively sandbox, I think that time and resource based quotas are still the way to go.

I haven't had the bandwidth to implement this idea further but will look into it.

PRs welcome

davydog187 avatar Jun 14 '25 14:06 davydog187

I have not seen this discussion before. The tracing works for function calls, returns, and also for when the current line changes. This means that it could be used to keep track of the amount of work done, "reductions", but there is no built-in way of stopping the system in a clean way. This could be added. Trace calls could probably also be added for loops, which could guarantee full control over limiting execution.

Just a thought.

rvirding avatar Jun 21 '25 22:06 rvirding

@rvirding is there any available documentation on how to use the tracing functionality of Luerl?

davydog187 avatar Jun 22 '25 15:06 davydog187

@rvirding out of curiosity, is the Luerl interpreter running something like Lua bytecode, or interpreting statements?

crertel avatar Jun 22 '25 22:06 crertel