
Restrict memory available to an executing kernel

abrown opened this issue 4 years ago · 1 comment

The current API has functions (i.e., create_buffer, read_buffer, write_buffer) that define which memory should be accessible to a parallel kernel. The API also allows any function in a module to become a parallel function; there is no mechanism in the API that prevents the chosen kernel from accessing addresses in the Wasm module's memory that lie outside the passed buffers (e.g., i32.load (i32.const 0x42)). The proof-of-concept implementation makes no effort to enforce this kind of restriction for kernels that run on the CPU; GPU kernels are a different story. This issue is for discussing proposals addressing the question: how do we restrict the memory accessible to the kernel?
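To make the problem concrete, here is a minimal, hypothetical kernel module (the parameter names and the buffer-passing convention are assumptions for illustration, not part of the API): because the kernel shares the module's linear memory, a load of an arbitrary address is just as valid as a load within the passed buffer.

(module
  (memory 1)
  ;; Hypothetical convention: the buffer is passed as a start address and a
  ;; length into this module's own linear memory.
  (func (export "kernel_run") (param $buf_start i32) (param $buf_len i32)
    ;; An in-bounds access to the passed buffer:
    (drop (i32.load (local.get $buf_start)))
    ;; ...but this load of an unrelated address is equally valid Wasm, and
    ;; nothing in the current API forbids it:
    (drop (i32.load (i32.const 0x42)))
  )
)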

abrown commented Sep 09 '21 20:09

This comment by @lukewagner has been moved verbatim from https://github.com/abrown/wasi-parallel-spec/issues/4#issuecomment-912900861 as part of moving the spec repository:

One possible direction is to use the module-linking proposal to pass the kernels into wasi-parallel as uninstantiated modules that wasi-parallel can instantiate as many times as it wants, with each instance getting its own disjoint unshared linear memory. Using the existing text format described in the module-linking explainer (which will look somewhat different in the future as module-linking is factored out of core wasm, but the concepts will be the same), a sketch of a client module would look like:

(module $Client
  (import "wasi-parallel" (module $WasiParallel
    (import "kernel" (module
      (export "kernel_run" (func ...))
    ))
    (export "run" (func ...))
  ))
  (module $Kernel1
    (func (export "kernel_run") ...)
  )
  (instance $parallel1 (instantiate $WasiParallel
    (import "kernel" (module $Kernel1))
  ))
  (module $Kernel2
    (func (export "kernel_run") ...)
  )
  (instance $parallel2 (instantiate $WasiParallel
    (import "kernel" (module $Kernel2))
  ))
  (func (export "run_all")
    (call (func $parallel1 "run") ...)
    (call (func $parallel2 "run") ...)
  )
)

The idea here is:

  • The imported $WasiParallel module is implemented by the host runtime.
  • The $Client module supplies its kernel modules ($Kernel1 and $Kernel2) to $WasiParallel using instantiate once for each kernel, producing two instances of $WasiParallel.
  • The $Client can trigger actual parallel execution of its kernel functions by invoking the run export of $WasiParallel on the appropriate instance.
  • At runtime, $WasiParallel can create as many or as few instances of the given kernel module as it wants, based on the number of cores, etc. This allows each kernel instance to have its own unshared memory.

What's nice about this approach is, without running it, the runtime can statically identify the kernels and, if necessary, compile them specially AOT.

That being said, if you were to focus exclusively on running on many cc-NUMA cores, then this approach might be overkill, and a funcref closing over a single instance with shared memory might be fine. (Technically, core wasm would need to be extended to add a shared funcref, and shared would need to be added to all the other kinds of instance state (viz., globals and tables) so that sharing a funcref is thread-safe, as described here.) In that case, all the parallel threads would be running code that operates on the shared linear memory, but that might be fine, or even desirable, depending on what you want.
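For context, a sketch of what the shared-memory side of that alternative might look like, using the shared memories and atomic instructions from the threads proposal. This is a hypothetical illustration only; the shared funcref extension mentioned above does not exist yet, so the funcref-closing-over-an-instance part cannot be expressed in today's text format:

(module $SharedKernel
  ;; A single shared linear memory, visible to every thread that runs the
  ;; kernel (min/max of 1 page; shared memories require a declared maximum).
  (memory (export "memory") 1 1 shared)
  ;; Hypothetical kernel: every thread operates on the same memory, so
  ;; cross-thread accesses must use atomic instructions.
  (func (export "kernel_run") (param $i i32)
    (drop (i32.atomic.load (local.get $i)))
  )
)

Here there is only one instance and one memory, so the memory-restriction question from this issue remains: every thread can reach the whole shared linear memory, not just the passed buffers.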

abrown commented Sep 09 '21 20:09