
Documentation request: self-modifying code

Open KloudKoder opened this issue 3 years ago • 6 comments

For some reason I can't seem to find this in the docs, so apologies if I just didn't use the right search terms.

An obvious use case for WebAssembly is to implement a compiler, either for compiling human code to WebAssembly itself, or for automatically generating programs. It would be nice to have a "how to" explaining best techniques and caveats to be aware of.

My default assumption is that a file would need to be generated containing new WebAssembly, then somehow the user's browser would need to be redirected to execute it, either locally on the user's own hard drive, or remotely on the server side.

If this just isn't possible at all for some reason, then that would force any would-be compiler to become an interpreter, with all the implied performance taxes.

KloudKoder avatar Apr 23 '21 03:04 KloudKoder

Can you please elaborate a bit more? The title seems to be about self-modifying code, but the post ends up being about implementing a compiler that produces a file.

Code in a Wasm module cannot access its own representation, which cuts off some significant routes to self-modification. However, it is possible to produce a module binary from another module. For example, you can run Clang in the browser and then run its output within the same page: https://youtu.be/5N4b-rU-OAA (not without difficulty though).

penzn avatar Apr 23 '21 17:04 penzn

@penzn Thanks for the video. I watched the whole thing.

What I mean by "self-modifying code" is in the relaxed sense of being able to create WASM byte code from within WASM (which is clearly possible because it's just integer manipulation) and then somehow run it as a child process (probably in an inaccessible space). The conversation with the WASM hypervisor would look kind of like this:

CLIENT: What version of WASM are you capable of running?

HYPERVISOR: 8.3

CLIENT: OK I just compiled some efficient WASM 8.3 byte code. Here are the bases and lengths of: (1) the WASM bytecode, (2) its readonly data structures, (3) the input data, and (4) space for the output data. Please run it.

HYPERVISOR: I refuse to run your code because it contains invalid byte code.

CLIENT: Sorry, my mistake. Let's try again with this corrected version.

HYPERVISOR: The code aborted because it tried to output 535 bytes, but your output buffer was only 202 bytes long.

CLIENT: OK, let's try again. Here's a 535-byte output buffer.

HYPERVISOR: OK that ran successfully. The output, which is still 535 bytes long, has been saved to your output buffer. The child process has been terminated and its memory freed. However, I have retained a cached copy of compiled code and its readonly data structures in case you want to execute them again sometime.

There are then some obvious ways to parallelize this: many unique inputs to be run through a single function, or various functions paired with their respective inputs. But just a "how to" on doing the above would be super useful. Probably also including best practices, like how exactly the code should be handed off (bouncing through the virtual file system via WASI, bouncing off JavaScript, etc.). Ideally, as above, it would be via nonoverlapping buffers in client memory, but I assume that's not currently supported.
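
For illustration, the "just integer manipulation" step might look like the minimal sketch below. It is written in Python only to keep it short, and the helper names (`uleb128`, `vec`, `section`, `build_add_module`) are hypothetical, not taken from any library. It assembles a WebAssembly 1.0 binary that exports a single function `add(i32, i32) -> i32`; the same byte-level logic could in principle run inside a Wasm module that writes into its own linear memory.

```python
# Minimal sketch: assemble a tiny WebAssembly 1.0 module by hand.
# All helper names here are illustrative assumptions, not a real library API.

I32 = 0x7F  # binary encoding of the i32 value type


def uleb128(n: int) -> bytes:
    """Encode a non-negative integer as unsigned LEB128 (Wasm's varint)."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def vec(items: list) -> bytes:
    """Encode a Wasm vector: element count followed by the elements."""
    return uleb128(len(items)) + b"".join(items)


def section(section_id: int, payload: bytes) -> bytes:
    """Encode a section: one-byte id, payload size, then the payload."""
    return bytes([section_id]) + uleb128(len(payload)) + payload


def build_add_module() -> bytes:
    """Return the bytes of a module exporting add(i32, i32) -> i32."""
    header = b"\x00asm" + (1).to_bytes(4, "little")  # magic + version 1

    # Type section (id 1): one function type, (i32, i32) -> (i32)
    func_type = b"\x60" + vec([bytes([I32]), bytes([I32])]) + vec([bytes([I32])])
    type_sec = section(1, vec([func_type]))

    # Function section (id 3): function 0 uses type 0
    func_sec = section(3, vec([uleb128(0)]))

    # Export section (id 7): export function 0 under the name "add"
    name = uleb128(len(b"add")) + b"add"
    export = name + b"\x00" + uleb128(0)  # 0x00 = function export kind
    export_sec = section(7, vec([export]))

    # Code section (id 10): no locals; local.get 0, local.get 1, i32.add, end
    body = vec([]) + bytes([0x20, 0x00, 0x20, 0x01, 0x6A, 0x0B])
    code_sec = section(10, vec([uleb128(len(body)) + body]))

    return header + type_sec + func_sec + export_sec + code_sec


if __name__ == "__main__":
    print(build_add_module().hex())
```

Bytes like these are what the host would have to validate and instantiate; how they get handed over to the host is the part this issue is asking about.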

KloudKoder avatar Apr 24 '21 07:04 KloudKoder

I understand your intent, though there are a few tricky bits. For example, the spec does not yet have a way to define a subprocess, modules can have more than one executable function, the spec does not anticipate multiple supported versions being concurrently available, etc.

When running Wasm from JS it is not too hard to do: a new module can be taken from the old module's memory and instantiated. No recompilation is required while the new object is alive. There might be WASI-compat or other libraries that streamline the process, but at the core there would still be memory interactions. There must be other demos of this around.

I don't think WASI has an API for this yet; it would be worth asking what their thoughts on this are.

penzn avatar Apr 27 '21 00:04 penzn

@penzn The WASI site makes it really difficult to find what POSIX functions they emulate, but in any event, I think you're right that they don't have anything like fork(). (Not that I'm asking for that; it's too heavy-handed. Just the interaction above would be sufficient.) Any suggestions on who or where I should ask about this? Here on their Git repo or elsewhere?

I'm also intrigued about your assertion that JS should be able to do this. JS is something I avoid like the plague and therefore don't know much about, but if that's the only option for the foreseeable future, then I'd like to know more if you have some recommended links.

KloudKoder avatar Apr 28 '21 15:04 KloudKoder

@KloudKoder a good place to start would be the WASI repo (inside this org). BTW, as part of the module linking proposal there might eventually be a mechanism to dynamically instantiate modules from one another. #1415 was presented last Tuesday, which would drive the transition.

penzn avatar May 01 '21 02:05 penzn

@penzn Thanks, that might be a sufficient generalization of this issue. Will comment further here when I understand better.

KloudKoder avatar May 01 '21 13:05 KloudKoder