WebAssembly plugins system

Open CBenoit opened this issue 3 years ago • 193 comments

Basically load .wasm files.

Capabilities:

  • provide new commands,
  • hook on events,
  • call helix commands,
  • access UI stuff…?

At first we could use a basic toml config file or CLI to feed .wasm files to the editor.

A way to configure permissions on a plugin basis could be investigated to use the sandboxing capabilities coming with WASM. Example with wasmtime:

$ wasmtime --dir=. --dir=/tmp demo.wasm [args…]

(reference)
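To make the per-plugin permission idea more concrete, here is a minimal sketch in Rust. It only models the idea behind wasmtime's `--dir` grants: a plugin is given a list of directories and any path outside them is refused. `PluginPermissions`, `allow_dir` and `may_access` are hypothetical names invented for illustration, not Helix or wasmtime APIs, and the check is purely lexical.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical per-plugin permission set (not an actual Helix type).
#[derive(Debug, Default)]
struct PluginPermissions {
    /// Directories the plugin may access, similar in spirit to `--dir` grants.
    allowed_dirs: Vec<PathBuf>,
}

impl PluginPermissions {
    /// Grants access to one more directory (builder style).
    fn allow_dir(mut self, dir: impl Into<PathBuf>) -> Self {
        self.allowed_dirs.push(dir.into());
        self
    }

    /// Returns true if `path` lies inside one of the granted directories.
    /// The comparison is lexical and component-wise; a real implementation
    /// would also have to resolve `..` segments and symlinks.
    fn may_access(&self, path: &Path) -> bool {
        self.allowed_dirs.iter().any(|dir| path.starts_with(dir))
    }
}

fn main() {
    let perms = PluginPermissions::default()
        .allow_dir(".")
        .allow_dir("/tmp");

    for candidate in ["/tmp/demo.txt", "/etc/passwd"] {
        println!("{candidate}: {}", perms.may_access(Path::new(candidate)));
    }
}
```

In an actual embedding the runtime's own sandbox would enforce this; the sketch only shows what a per-plugin configuration entry might need to carry.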

I think the biggest challenge is to get well-defined interfaces down, but let's not be afraid to break them during the early stages.

Later we could investigate embedding a wasm-based scripting language such as Grain or AssemblyScript.

Here are some references:

I'm willing to experiment soon

CBenoit avatar Jun 05 '21 04:06 CBenoit

I think the most basic demo would be implementing one of the existing commands in such a way, e.g. https://github.com/helix-editor/helix/blob/407b37c3279bfd0ae2bf756bc022d47d5db446d9/helix-term/src/commands.rs#L123-L137 It only depends on the Context, and it only uses functions from core (move_horizontally) without interacting with the UI. I think most of the primitives in core should be exposed (selection/range/transaction/etc) and we can have a subset of view exposed (retrieving a selection from a doc, applying transactions)
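As a rough illustration of that shape (a command that only receives a context and calls a core primitive), here is a self-contained sketch. `Context`, `move_horizontally`, `registry` and `run` are simplified, hypothetical stand-ins written for this example; they do not mirror the real helix-core or helix-term signatures.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the editor state a command receives.
struct Context {
    text: String,
    /// Cursor position as a char index into `text`.
    cursor: usize,
}

/// Simplified stand-in for a core primitive: moves the cursor by `delta`
/// characters and clamps the result to the bounds of the text.
fn move_horizontally(text: &str, cursor: usize, delta: isize) -> usize {
    let len = text.chars().count() as isize;
    (cursor as isize + delta).clamp(0, len) as usize
}

/// A command depends only on the context, as in the linked example.
type Command = fn(&mut Context);

fn move_char_left(cx: &mut Context) {
    cx.cursor = move_horizontally(&cx.text, cx.cursor, -1);
}

fn move_char_right(cx: &mut Context) {
    cx.cursor = move_horizontally(&cx.text, cx.cursor, 1);
}

/// Name-to-command table; a plugin system would add entries to it.
fn registry() -> HashMap<&'static str, Command> {
    let mut commands: HashMap<&'static str, Command> = HashMap::new();
    commands.insert("move_char_left", move_char_left);
    commands.insert("move_char_right", move_char_right);
    commands
}

/// Looks a command up by name and runs it; returns false if it is unknown.
fn run(commands: &HashMap<&'static str, Command>, name: &str, cx: &mut Context) -> bool {
    match commands.get(name) {
        Some(command) => {
            command(cx);
            true
        }
        None => false,
    }
}

fn main() {
    let commands = registry();
    let mut cx = Context { text: "helix".to_string(), cursor: 0 };
    run(&commands, "move_char_right", &mut cx);
    run(&commands, "move_char_right", &mut cx);
    run(&commands, "move_char_left", &mut cx);
    println!("cursor = {}", cx.cursor);
}
```

With a wasm plugin, the registered entry would call into the guest module instead of a native function, but the host-side lookup could stay the same.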

archseer avatar Jun 05 '21 04:06 archseer

Mentioned on Matrix:

Re: wasm plugins, rich clipboard support would be a great demo and to that end, https://github.com/neovim/neovim/issues/14706 "[RFC] Neoclip: multi-platform clipboard provider w/o extra dependencies" may be worth watching.

archseer avatar Jun 05 '21 09:06 archseer

What benefit does this provide over binary plugins? Will plugins be able to access each other's state (functions, variables etc.) in the same way that plugins programmed in a single language (e.g. emacs) are able to, or something else?

My bad if the answer is obvious btw. I'm not very familiar with WebAssembly or plugin architectures.

buster-blue avatar Jun 07 '21 21:06 buster-blue

What benefit does this provide over binary plugins? Will plugins be able to access each other's state (functions, variables etc.) in the same way that plugins programmed in a single language (e.g. emacs) are able to, or something else?

My bad if the answer is obvious btw. I'm not very familiar with WebAssembly or plugin architectures.

I think this post explains it well. There's a lot of uncharted ground in regards to plugin systems, especially with sharing state, but you can see past discussion on this here.

kirawi avatar Jun 08 '21 01:06 kirawi

@kirawi Thank you for the link to Veloren, I missed that one

CBenoit avatar Jun 08 '21 03:06 CBenoit

If you want to embed the wasm compiler with the binary or as a library etc., I would advise against wasm due to slower speed and missing performance optimisations on many targets.

"Of course, wasm3 runs fine on aarch64, but at about 3% of native speed."

This benchmark shows quite strikingly the optimisations that are missing here and that luajit can do using this lua library.

Also it would be advisable to have execution time benchmarks of webassembly against all usable alternatives before making a decision (i.e. by listing some speed and platform compatibility requirements). It would not be very useful to end up with something significantly slower than neovim.

matu3ba avatar Jun 11 '21 19:06 matu3ba

If you want to embed the wasm compiler with the binary or as a library etc., I would advise against wasm due to slower speed and missing performance optimisations on many targets.

"Of course, wasm3 runs fine on aarch64, but at about 3% of native speed."

This benchmark shows quite strikingly the optimisations that are missing here and that luajit can do using this lua library.

Also it would be advisable to have execution time benchmarks of webassembly against all usable alternatives before making a decision (i.e. by listing some speed and platform compatibility requirements).

The intention is to use wasmer or wasmtime which are both written in Rust, and are quite fast in the benchmarks. However, real-world use is not going to be accurately represented by benchmarks. If it wasn't fast, it wouldn't be used by Google or Figma to power some of their most resource-intensive client-side services, nor would it be able to power game engines on the web. (Here is a more complete list.)

Lua lacks multithreading, SIMD, and has a smaller ecosystem. It's a lot more convenient to enable users to opt for their favourite language rather than have them learn a separate language and its ecosystem. I can't speak on behalf of everyone, but I believe that most Rustaceans firmly agree with me, as many libraries have wasm as a compile target. I would say we collectively have a lot more experience with wasm as a whole.

kirawi avatar Jun 11 '21 19:06 kirawi

We'll probably go with wasmtime, and I can see in one of the benchmarks you linked that performance is far from bad. By the way, wasmtime had been supporting WASI for a while prior to the article's publication date, but they don't mention it, so I assume they used old data from a previous benchmark:

Older versions of the following runtimes had been tested in the previous rounds: […]

Given that the Cranelift code generator is still very young and already that good, I'm not too afraid. Furthermore, performance tracking is a thing at the Bytecode Alliance.

Besides all the other good points made by @kirawi on performance, I really like the ecosystem argument which is the reason why we want to use WASM in the first place. Given that performance is probably good enough (or more than enough), the main focus should be on whether user/developer experience is good.

CBenoit avatar Jun 11 '21 21:06 CBenoit

Thanks for considering Wasmer! We are working on a set of benchmarks at the moment. We had two bugs that prevented Wasmer from shining in the one you showcased @CBenoit, but those are already solved in master.

Right now Wasmer is about 20~30% faster than the other runtimes when using the LLVM compiler!

In general, we recommend Cranelift if the compilation times need to be fast (that is mainly for development), and LLVM for production (similar to how Rust uses LLVM always to compile in release mode, and Cranelift as an experimental compiler for faster compilation times in debug mode) :)

syrusakbary avatar Jun 11 '21 21:06 syrusakbary

Thanks for considering Wasmer! We are working on a set of benchmarks at the moment. We had two bugs that prevented Wasmer from shining in the one you showcased @CBenoit, but those are already solved in master.

Right now Wasmer is about 20~30% faster than the other runtimes when using the LLVM compiler!

In general, we recommend Cranelift if the compilation times need to be fast (that is mainly for development), and LLVM for production (similar to how Rust uses LLVM always to compile in release mode, and Cranelift as an experimental compiler for faster compilation times in debug mode) :)

This is a question I've had for a while, but what is the main difference between wasmtime and wasmer? I can't recall exactly, but I believe they're targeting different use cases, correct?

kirawi avatar Jun 12 '21 20:06 kirawi

Thank you @syrusakbary for your input!

This is a question I've had for a while, but what is the main difference between wasmtime and wasmer? I can't recall exactly, but I believe they're targeting different use cases, correct?

I'm interested in hearing more as well!

I was more in favor of wasmtime for the following reasons:

  • backed by the @bytecodealliance
  • @alexcrichton and other notable names from the Rust compiler and std lib team are working on it
  • a single back-end is supported (Cranelift) so we can expect it to be well supported and maintained while supporting three back-ends (Singlepass, Cranelift and LLVM) sounds like a heavy task

On the other hand, wasmer:

  • including dependencies, has fewer lines of code (on lib.rs: ~5.5–8.5MB for ~177K SLoC vs wasmtime with ~18MB for 417K SLoC), meaning we probably get faster compile times and a smaller hx binary without even using LTO (we would need to actually see what we get in practice though),
  • is used by Veloren
  • is used by zellij
  • (has Japanese documentation :stuck_out_tongue: )

I'll also ask on Veloren's Discord why they decided to use wasmer to form a better opinion.

I also found this article published a few months ago:

  • Wasmer has the best overall support compatibility with every programming language at super-speed
  • Wasmtime is lightning-fast and compact, with good configurability but fewer languages supported
  • Lucet is a specialized solution for running untrusted WebAssembly programs inside a larger application
  • WAMR runs with a small footprint

CBenoit avatar Jun 13 '21 01:06 CBenoit

@CBenoit @kirawi There are two mutually exclusive possible use cases for WASM plugins where one really wants to have a plugin manager: 1. a shipped compiler, or 2. hooking into the repo build system and the compiler used to build the WASM libraries/binaries.

  1. Interpreted, for which one usually wants to ship a dedicated compiler (which is language-specific for performance). This is effectively a soft language lock-in, as one wants to ship it with the editor and not bundle all compilers/programs that can emit WASM.

  2. Compiled (fastest), which can be completely independent, but requires support from 1. the build system and 2. distribution package managers etc. to have the compilers installed. If the language does not support easy build system integration and there is no upstream convention for where to put stuff, it scales poorly (because the package manager of helix cannot build stuff).

Solution 1. gives more of an "it always works" experience with the language that the compiler supports.

Solution 2. gives more of an "it's super fast" experience, but needs some setup and might break. Unless you can use nix for 2. (i.e. with flakes), 2. is just painful because 2.1 many distros do not ship all compilers, or ship only specific versions, and then you are stuck on one (e.g. for not yet stable languages), and 2.2 you want to simplify build system stuff for plugin developers.

nix with flakes is the de facto best solution for 2.1 (shipping compilers+libc(s)? and making sure they exist, are configured/built identically and are up to date enough for the plugin ecosystem). For 2.2 nix flakes should work, but a convention for how build commands/shell scripts are supposed to be called could also work (I am not sure how stable flakes are for that use case for all the build systems that exist).

Ask yourself which solution is better for the goals of the project helix or how you want to deal with the mess of 1. shipping compilers and 2. supporting the build systems of plugins.

matu3ba avatar Jun 13 '21 19:06 matu3ba

We won't be shipping a compiler, though? WASM is a compile target for Rust itself as wasm32-unknown-unknown, and other compilation methods use their own external tool (Wasmer and Wasmtime also provide their own) to compile it down to .wasm, which would be run by the WASM runtime. Languages like Python or Lua currently work by compiling their interpreters down to .wasm, but in the future they could be directly compiled down to .wasm once the WASM GC is standardized and implemented.

Did I understand you correctly?

kirawi avatar Jun 13 '21 19:06 kirawi

@kirawi How does this change the problem when the plugin is written in a language for which the distribution doesn't ship a compiler (or ships a super old/incompatible one, a slow libc, etc.)? The plugin then just doesn't work? Or it works differently and the plugin author should deal with the mess of different compiler versions?

matu3ba avatar Jun 13 '21 19:06 matu3ba

No, it's just .wasm. The user doesn't need to download anything else other than that standalone .wasm output which is what the plugin actually is. Think of it as creating an executable, you don't need all the code and compiler stack to run the executable as an end-user, because the OS already has everything it needs to run the executable inside the executable itself. The executable in this case is .wasm, and the OS is the Wasm runtime. Wasm is completely language agnostic and is just bytecode.
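To illustrate the "it's just bytecode" point: every WebAssembly binary module starts with the same 8-byte preamble (the magic bytes `\0asm` followed by version 1), whatever language it was compiled from. The sketch below checks only that preamble; the function name `looks_like_wasm_module` is made up for this example, and a real runtime performs full validation well beyond this.

```rust
/// Magic bytes at the start of every WebAssembly binary module: "\0asm".
const WASM_MAGIC: [u8; 4] = [0x00, 0x61, 0x73, 0x6D];
/// Binary format version 1, encoded as a little-endian u32.
const WASM_VERSION: [u8; 4] = [0x01, 0x00, 0x00, 0x00];

/// Returns true if `bytes` begins with the WebAssembly module preamble.
/// This is only a quick sanity check, not validation of the module.
fn looks_like_wasm_module(bytes: &[u8]) -> bool {
    bytes.len() >= 8 && bytes.starts_with(&WASM_MAGIC) && bytes[4..8] == WASM_VERSION
}

fn main() {
    // The smallest module is the preamble alone, with no sections.
    let empty_module = [0x00, 0x61, 0x73, 0x6D, 0x01, 0x00, 0x00, 0x00];
    // The first bytes of a native ELF executable, for comparison.
    let elf_header = [0x7F, b'E', b'L', b'F', 0x02, 0x01, 0x01, 0x00];

    println!("empty module: {}", looks_like_wasm_module(&empty_module));
    println!("ELF binary:   {}", looks_like_wasm_module(&elf_header));
}
```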

kirawi avatar Jun 13 '21 19:06 kirawi

@kirawi Thanks a lot for the clarification.

How are 1. integrity (nobody tampered with the build process/environment), 2. debuggability/traceability (what went wrong) and, related, 3. uniformity (potential contributors can reproduce the .wasm to manually reproduce/fix stuff) handled?

Point 1. is a blocker for usage in any more security-critical developer environments, point 2. is okayish not to have (debugging luajit also doesn't work very well), and lacking point 3. can make it super annoying to find contributors.

matu3ba avatar Jun 13 '21 19:06 matu3ba

  1. Couldn't you argue that for any software? If you're worried about it, you can compile the plugin yourself.
  2. Wasmtime and Wasmer.
  3. The potential contributors would need to know the code, but that's the same for any language.

The same challenges exist for any language, but Wasm is a compile target, not a language, allowing for anyone to write any plugin in any language they want, instead of restricting them to one.

kirawi avatar Jun 13 '21 19:06 kirawi

  1. I strongly disagree on 1 (reproducible builds), since this places too much trust in single developers, who could do harm while making it hard to detect. You cannot reproduce a binary/wasm file without an identical build environment. Also you may have to deal with third parties shipping binaries etc.
  2. Thanks a lot.
  3. Conventions and build systems influence how code is structured, but that is overall correct.

matu3ba avatar Jun 13 '21 20:06 matu3ba

Indeed, but that is something that is just going to happen no matter what. The programs that people use are going to be written in different languages, and if you want to do that kind of risk analysis, you would want to compile them yourself. But the other benefit is that since you can write a plugin in your preferred language, you can for the most part write all your plugins in-house. Lua is arguably worse because it lacks sandboxing, while with Helix and Wasm we would have full control over what a plugin can or cannot do.

It's a tradeoff for sure, but I think Wasm opens up a lot of possibilities to circumvent the cons of using Wasm.

kirawi avatar Jun 13 '21 20:06 kirawi

@kirawi How much performance loss does sandboxing mean?

L3 cache can likely never fully be mitigated without notable performance loss, since it is shared between cores (and mitigation would require accurate time synchronisation between memory controllers of cores).

Here is the list of possible cache attacks, which looks scary to me. And on top of that: "most existing hardware is broken in terms of security! And the underlying cause is the ISA, which is our hardware-software contract." The RISC-V timing channel analysis paper should be linked in the article => flushing might not flush the cache immediately.

matu3ba avatar Jun 13 '21 20:06 matu3ba

Wasmer and Wasmtime are both sandboxed, so you can reference the earlier benchmarks for them. Admittedly I'm not sure on the cache aspect, and I'm aware that Wasm sandboxing is far from perfect, but I think that leads back to the point that you would need to evaluate the plugin if you're untrusting of the plugin. I also found this article, maybe it'll shed some light.

Just being on the browser right now, you're probably running quite a lot of JavaScript and Wasm already. I think this discussion, though important in general, is bike shedding in this context. There isn't a perfect solution here, and there probably never will be, but Wasm seems like the best option for Helix.

kirawi avatar Jun 13 '21 20:06 kirawi

Can we have a two-in-one option? On one hand we let developers have it compiled so it could be faster. On the other hand we can have it interpreted to allow testing stuff faster. We could ship the interpreter as one of the compiled plugins. Not sure how feasible this is.

pickfire avatar Jun 14 '21 15:06 pickfire

That's definitely some workflow we could think of at some point. Makes me think of the gcc-emacs that is compiling elisp instead of interpreting it IIRC.

I prefer investigating the embedded runtime approach fully before we consider that though.

CBenoit avatar Jun 14 '21 16:06 CBenoit

Just for clarification. Wasm3 is 4-5 times faster than, say, Python 3.9, yet weighs only ~150KB. It doesn't require any JITting, etc., so it runs exactly the same on iOS, Android, Windows and other platforms.

On Apple M1 for instance, it showed ~5-7% of native speed, which is somewhat better than the 3% that was claimed here. But this wasn't heavily tested/optimized anyway.

vshymanskyy avatar Jun 14 '21 21:06 vshymanskyy

Today in the news: https://wasmer.io/posts/wasmer-2.0

Seems to have some support for reference types. I'd still like to see a comparison with wasmtime beyond just performance.

(Edit: Nice, I see the announcement is written by @syrusakbary himself)

archseer avatar Jun 17 '21 12:06 archseer

Hey all👋🏻

I work on Wasmtime (I manage the Fastly team doing much of the development work), and represent the Bytecode Alliance, so I don't think it'd be appropriate to chime in with a direct comparison between the two projects. Instead, I want to offer conversations to help answer questions you might have—feel free to shoot me emails ([my gh handle]@fastly.com) or a DM on the Bytecode Alliance Zulip.

A few pieces of information about Wasmtime that might be relevant to your evaluation:

  • We recently substantially overhauled our embedding API to make it provide excellent usability and safety in particular in async runtime environments, but also improving single-threaded, sync-API embeddings.
  • Based on the description in the OP, it looks like our witx-bindgen toolchain could provide an excellent basis for defining and exposing plugin APIs.
  • We're fully committed to having a pure-Rust solution based on Cranelift, and are putting substantial efforts into ensuring the safety of our implementation.
  • Wasmtime is in active use in plugin system-like environments such as Kubewarden

tschneidereit avatar Jun 17 '21 14:06 tschneidereit

Another idea: Do all configuration and plugins prior to compiling, similar to how the suckless people do it. The end-user would clone a repo/crate that depends on the helix crate and acts as a configuration template. The editor would be configured by filling in some structs and passing a Config object to an initialization procedure from the helix crate. Similarly, plugins would just be crates each exporting some kind of plugin definition, which would also be passed to the initialization procedure of the helix crate.

Pros

  • Native code, no WebAssembly overhead
  • Does not depend on a C ABI, unlike approaches that involve loading shared libraries
  • Configuration language is just Rust itself, which is much more powerful than e.g. TOML (think: color manipulation for themes etc.)
  • We get a plugin distribution/versioning infrastructure for free (crates.io)

Cons

  • End user would have to have either Rust/Cargo or Nix installed
  • Config changes would require recompilation
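For readers unfamiliar with the suckless style, a compile-time configuration along these lines might look like the sketch below. Everything in it (`Config`, `PluginDef`, `init_editor`, the `DARK_THEME` plugin) is hypothetical and invented for illustration; no such API exists in the helix crates.

```rust
/// Hypothetical configuration struct the user fills in before compiling.
#[derive(Debug, Clone, PartialEq)]
struct Config {
    theme: String,
    scrolloff: usize,
}

impl Default for Config {
    fn default() -> Self {
        Config { theme: "default".to_string(), scrolloff: 5 }
    }
}

/// Hypothetical plugin definition that a plugin crate would export.
struct PluginDef {
    name: &'static str,
    /// Called once at startup so the plugin can adjust the configuration.
    init: fn(&mut Config),
}

fn dark_theme_init(config: &mut Config) {
    config.theme = "dark".to_string();
}

/// What a plugin crate might export as its single public item.
const DARK_THEME: PluginDef = PluginDef { name: "dark-theme", init: dark_theme_init };

/// Stand-in for the editor produced by the initialization procedure.
#[derive(Debug)]
struct Editor {
    config: Config,
    plugins: Vec<&'static str>,
}

/// Stand-in for the initialization procedure: runs each plugin's init hook
/// in order and records the names of the loaded plugins.
fn init_editor(mut config: Config, plugins: &[PluginDef]) -> Editor {
    let mut names = Vec::new();
    for plugin in plugins {
        (plugin.init)(&mut config);
        names.push(plugin.name);
    }
    Editor { config, plugins: names }
}

fn main() {
    // The user's "config file" is this main function: edit, then recompile.
    let config = Config { scrolloff: 8, ..Config::default() };
    let editor = init_editor(config, &[DARK_THEME]);
    println!("{editor:?}");
}
```

Every change to `main` here means a rebuild, which is the recompilation cost listed under Cons.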

malte-v avatar Jun 19 '21 11:06 malte-v

Another idea: Do all configuration and plugins prior to compiling similar to how the suckless people do it.

Some concerns about this approach:

  • Wasm enables users to write plugins in any language that can be compiled down to it:

    This is arguably very important in nurturing a good plugin ecosystem. Let's take neovim as an example. There are plugins made in vimscript, yes, but lua lowered the entry barrier to writing plugins and experimenting so much that there has been a recent explosion in the number of plugins. If helix went a step further and allowed writing plugins in any language and still get the same performance (considering the wasm overhead of course), that'd be a game changer.

  • Rust might not be the best language to write plugins in:

    Scripting languages are better suited for tasks like this, and locking it into a language that has a steep learning curve might not be the best decision.

  • Recompiling the config on every single change would be annoying:

    st for example is a very small program, and recompiling it doesn't take much time. Rust already has slow(er) compilation times so it would hinder fast experimentation.

sudormrfbin avatar Jun 19 '21 11:06 sudormrfbin

Scripting languages are better suited for tasks like this, and locking it into a language that has a steep learning curve might not be the best decision.

Most scripting languages rely on a garbage collector. Does Wasm support this? How should we handle objects created on the GC side and passed to Rust? The plugin author would have to tell the GC to forget about objects, which is neither user-friendly nor particularly safe.

If helix went a step further and allowed writing plugins in any language and still get the same performance (considering the wasm overhead of course), that'd be a game changer.

If helix allowed writing plugins in any language, then the plugin API could at most use features that are available in any language and are also available in the Wasm ABI. Does Wasm support closures? Dynamic dispatch? (Genuine questions, I don't really know much about Wasm.) I think supporting every language around will make the plugin API very limiting.

My main concern is the amount of glue code required to make Wasm plugins (let alone non-Rust ones) work. I would say that Rust is a pretty mainstream and widely used language nowadays, so I wouldn't worry about the user-friendliness of Rust too much. On the other hand, setting up a Wasm build environment for a language that doesn't have great support for it is often nontrivial and might put off plugin authors.

malte-v avatar Jun 19 '21 12:06 malte-v

Regarding garbage collection: currently some scripting languages, like AssemblyScript, implement it manually, and yes, there is a proposal for it. Obviously, we should aim for user friendliness at a minimum here. However, I don't think what you cited will be an issue: plugin authors will not have to tell the GC to forget about objects.

If helix allowed writing plugins in any language, then the plugin API could at most use features that are available in any language and are also available in the Wasm ABI. Does Wasm support closures? Dynamic dispatch? (Genuine questions, I don't really know much about Wasm.) I think supporting every language around will make the plugin API very limiting.

Obviously the API will be limited to what WASI allows us to do. We know that, it's okay. It's a tradeoff.

My main concern is the amount of glue code required to make Wasm plugins (let alone non-Rust ones) work. I would say that Rust is a pretty mainstream and widely used language nowadays, so I wouldn't worry about the user-friendliness of Rust too much. On the other hand, setting up a Wasm build environment for a language that doesn't have great support for it is often nontrivial and might put off plugin authors.

Glue code can be part of a library. We'll probably provide a Rust one and a few others. Then yeah, you're right, it'll probably be harder for a language with bad wasm support, but… should we care about that? I don't think so. Languages with good wasm support are numerous and growing in number anyway.

CBenoit avatar Jun 19 '21 14:06 CBenoit