wren icon indicating copy to clipboard operation
wren copied to clipboard

(enh) allow passing module to Meta.compile

Open joshgoebel opened this issue 4 years ago • 18 comments

Resolves:

// TODO: Allow passing in module?

I made this optional, but I'd be happy if I was told we could just make it mandatory - that would simplify things a lot.

Also happy to work on any needed docs/tests if this is approved for merging.

joshgoebel avatar May 15 '21 13:05 joshgoebel

As this, for security reasons, I would say no. This would allow to compromise modules contents. If the aim is to have multiple cli like evaluation context, it should be done differently.

mhermier avatar May 15 '21 15:05 mhermier

As this, for security reasons, I would say no. This would allow to compromise modules contents.

Can you explain further or give an example?

joshgoebel avatar May 15 '21 15:05 joshgoebel

This would allow to inject code in a done module:

import "meta" for Meta
import "./foo"
Meta.eval("some malicious expression", "./foo")

Because this code is also reused for cli evaluation context, it is valid (modulo some arguments). And allow to temper a loaded module.

mhermier avatar May 15 '21 15:05 mhermier

I see what you mean but using modules as a "security context" seems pretty shaky ground to begin with if we assume someone can still evaluate ANY arbitrary code in the current module with Meta.eval(_). Why do I need to tamper with ./foo at all when I can just insert whatever malicious code I want into the running scope... are you assuming only Wren user code (no C) where having direct access to a module might expose OS level functionality that someone couldn't recreate with Wren alone?

It seems likely in many cases classes would be imported (not just the module) and then the boundary would already be broken:

import "meta" for Meta
import "./foo" for Foo
Meta.eval("some malicious expression using Foo") // no module needed

I'm curious if this a security feature by design or just a side effect of how it's currently implemented. The fact that we have that TODO is the source obvious means someone was at one time willing to see this opened up further to allow code to be injected into different modules.

If the aim is to have multiple cli like evaluation context, it should be done differently.

Did you have any suggestion of how?

joshgoebel avatar May 15 '21 15:05 joshgoebel

It was done by design so that code remain simple.

The window of corruption is low, but it exist. Basically if some remote data can be exploited somehow to be injected in a Meta.eval, even when a system is live you can trigger some payload by loading an extra module that is not yet in memory, and create some unexpected behavior. EDIT: A less convoluted way is when a module also as a factory. Since one can temper a class variable, after it was tempered, the factory is compromised by producing payload objects.

Ideally, execution contexts and modules should be accessible to the users and be type differentiated (subclassing or what ever required to differentiate them). With that differentiation, we should allow different behaviors. Meta.eval should/must always run with current module. But a different API should exist for execution context to appends code, while module would take allow code in some form (source or path to source) only at construction time.

mhermier avatar May 15 '21 16:05 mhermier

In order to exploit this [as written] from the outside you'd need code like this:

Meta.eval(potentially_unsafe_user_input, potentially_unsafe_user_input)

The second argument here is just sloppy coding, no? The module name should never be specified via user input that is not very carefully sanitized. Although, I suppose can nest calls to Meta.eval... 😭🤬🤯 Ugh.

import "meta" for Meta

var y
Meta.eval("y = 32","./eval.wren")
System.print(y) // => 32
Meta.eval("import \"meta\" for Meta\nMeta.eval(\"y = 19\",\"./eval.wren\")","safe")
System.print(y) // => 19

I'm curious to see what thoughts other might have.

joshgoebel avatar May 15 '21 17:05 joshgoebel

Hmmm true. Anyway, since there is potential interpretation of user input easily leading to code execution, extra care should be taken to combat exploitation.

mhermier avatar May 15 '21 17:05 mhermier

  const char* module;
  if (wrenGetSlotType(vm, 2) != WREN_TYPE_NULL) {
    module = wrenGetSlotString(vm, 2);
    if (wrenHasModule(vm,module)) {
      wrenSetSlotString(vm, 0, "Security: Cannot compile into an existing named module.");
      wrenAbortFiber(vm, 0);
    }
  } else {
    // Look up the module surrounding the callsite. This is brittle. The -2 walks
  }

@mhermier This would resolve the immediate security issue, yes?

Just allow modules to only be created, not modified? That's all I need for my use case. Long term I think you'd want to request kind of a "handle" to a new module and then once you had that you could use it to repeatedly compile/eval code into that module...

joshgoebel avatar May 21 '21 01:05 joshgoebel

By "existing named module", you mean a module loaded using import "foo" for ... ? if so probably. That still looks like a band-aid thought. It makes me wonder, what happens if you pass a module name that is not loaded yet, does it create a module ? If so you can maybe create a payload library, before it is imported by others ?

mhermier avatar May 21 '21 06:05 mhermier

By "existing named module", you mean a module loaded using import "foo" for ... ?

Or one previously created with Meta.compile... basically kind of making them read-only once created.

It makes me wonder, what happens if you pass a module name that is not loaded yet, does it create a module ?

Well, it can't predict the future... :-) if a module hasn't been loaded then the program is free to define it - or load it. Well maybe not exactly, see below. If you have access to the FS (or virtual FS) it's already easy enough to inject many modules just be create/import, no need for Meta - so I assume you're talking more about "built-in" modules. Though those are also easy to "replace" entirely just by dropping a file into wren_modules if I'm not mistaken.

If so you can maybe create a payload library, before it is imported by others ?

I think the compiler prevents this because modules are created as they are compiled (before any code runs)? For example the below fails even though we're trying to get in front of the import: (even with false)

import "meta" for Meta
var a = Meta.compile("System.print(\"bob\")","os")
a.call()

if (false) {
import "os" for Process
System.print(Process.version)
}

joshgoebel avatar May 21 '21 13:05 joshgoebel

Oh never mind, that works ever without the import "os"... I guess that system (CLI) defined classes exist no matter what as far as "hasModule" is concerned.

joshgoebel avatar May 21 '21 13:05 joshgoebel

Ok, you could wholly replace modules resolved dynamically from wren_modules (since they don't exist until imported).

Replacing the Farm class/module with our own malicious version:

import "meta" for Meta

var bad ="""
class Farm {
  static name { "WE OWN ALL DE BASE" }
}

var a = Meta.compile(bad,"farm")
a.call()

import "farm" for Farm
System.print(Farm.name)

joshgoebel avatar May 21 '21 13:05 joshgoebel

I feel like I need a more concrete example though. eval is a dangerous thing period If you're running untrusted code naively you already have plenty of problems... perhaps in Wren Core it's a little safer but once you're in the CLI or larger runtime now you have full access to the filesystem (or worse) and could do damage - without needing to resort to hacking individual modules...

The one thing I thought of that is kind of unique situation (where ACCESS matters) - is retrieving secrets (constants?) embedded inside a module - yet that doesn't seem to be possible because that requires injecting the code into the existing module (not replacing it), which my suggestion above prevents...

joshgoebel avatar May 21 '21 14:05 joshgoebel

A file system compromise could happens but it is a symptom of the exploitation. Even with a smaller runtime you could have an infected virtual machine that only acts on memory/network as a bot net. And it doesn't need file system access.

I think the simple scenario is enabling a remote console evaluation, or running a network service (web server) that is not well protected from remote inputs.

The thing is that eval can be useful in meta programming (and I don't remember if right now it runs out of the box or need some fixes), but it is also a scary tool that could potentially be exploited a lot. So I prefer loose a little time thinking about security and be safer than really sorry.

mhermier avatar May 21 '21 14:05 mhermier

it is also a scary tool that could potentially be exploited a lot. So I prefer loose a little time thinking about security and be safer than really sorry.

Oh sure, that makes sense. And I appreciate the thought for sure. I'm going to apply the patch I pasted earlier as that tightens things up a lot. Trying to think how for my use case I could tighten things further... I'm really only using it to bootstrap the very first module - the user script that's being loaded. I suppose I could have a special foreign function that is "one use only" that once the CLI uses it to register the first module it ceases to function.

That of course doesn't solve the TODO in the code, but it would solve my use case. :-)

joshgoebel avatar May 21 '21 17:05 joshgoebel

Security is not the right concern here, IMHO (It rather involved being on the other side of this airtight hatchway). The programmer can execute code directly, no need to inject before.

eval() is evil, IMHO, not because of security - although it does indeed relate to security many times - but because it interferes with the flow of the code, and there is almost always a better way to do it. In this sense, it's like goto.

I don't see an advantage with injecting code into an existing module. I do feel a hole in the current API - the compiled code pollutes the current module. But resolving it is not by injecting the code into an other module: it can be solved by, for example, always creating a new module for the code (because it is a breaking change, we may want to create a new method for that). Or we can create a Module class, with a new() method, and pass it to Meta.compile() instead of the new name. The advantage of this way is that you can also, behind a flag, allow retrieving an existing module and inject into it (the best place for that is the mirror API in my opinion).

ChayimFriedman2 avatar May 22 '21 20:05 ChayimFriedman2

The programmer can execute code directly, no need to inject before.

Right, overall I agree, but injecting could make an attack more dangerous... instead of trying to delete random files you could say hook/replace File.open so that instead of scanning the disk (which might be detected) you could just delete or log the specific files the app is opening. So allowing a module to silently replace another one could be problematic is some circumstances... or could also be a feature... but either way it's good to know what the behavior should be.

Or we can create a Module class, with a new() method, and pass it to Meta.compile() instead of the new name

I agree this sounds useful plan overall. For my specific use case I need to set the exact name so that stack traces point to the right files, but in many cases if the name were fudged a little that might be ok for most cases... so perhaps you could even for Module.new("XMLBuilder") and get back a module name that if not exact was still readable enough to know that it's the XML builder (in a stack trace, etc).

joshgoebel avatar May 23 '21 12:05 joshgoebel

Well I would got further with some Module and Script, but it was not introduced for design decisions.

@joshgoebel I'm thinking, maybe you want something like System.module(_) @ruby0x1 mentioned if I remember well.

mhermier avatar May 23 '21 19:05 mhermier