helix
WIP: Webassembly based plugin system
This is a work-in-progress implementation of a plugin system based on the wasmer runtime. What has been done so far:
- [X] - Automatically loads and starts any '.wasm' file located in '~/.config/helix/plugins' in a Wasmer engine.
- [X] - Call plugin functions "on_start", "on_key_press", "on_mouse_event" and "on_resize" for related events
- [X] - Expose "helix::log" function to plugins which can be used to log text
Remaining to be done:
- [ ] - Export functions to plugins for requesting that some action be taken in helix.
- [ ] - Add support for more event types (e.g. current mode changed?)
There is an example plugin implementation here. It currently can only receive events and prints the results.
I wish you had reached out before starting this. I have two concerns:
- We've settled on wasmtime over wasmer. I have a wasmtime branch that implements most of this PR and would have been a good starting point.
- Serializing to an intermediary format (protobuf) is wasteful and only slightly better than just using JSON. We want to closely integrate with the memory format of the VM to get good performance. If we're ser/de-ing on both sides, we might as well drop WASM and run plugins as external RPC binaries similar to LSPs.
  The problem here is exposing large documents: imagine a plugin is trying to read the entire document to do some analysis. If the document is very large (100MB), we'd have to use up a lot of resources to entirely serialize it, then deserialize it inside the plugin.
No problem. I've switched the code over to using wasmtime and will remove the usage of protobuf soon.
Here's a demo of the progress so far along with an explanation of what is happening:
https://user-images.githubusercontent.com/11895736/177974200-f5ca38ea-e36a-44ce-84bd-9374b8ffa78a.mp4
- In the bottom window, a call to tail the `helix.log` file is used to see the live logs.
- In the top window, a debug build of helix is started in verbose mode.
- When `hx` is started, all of the `.wasm` files located under `~/.config/helix/plugins` are loaded.
- A new `wasmtime` runtime is created for each loaded plugin.
- A function `helix::log` is exposed to each plugin, which the plugin can use to pass log messages back to the host (i.e. helix).
- When each plugin is started, the `on_start` function for the plugin, if one exists, is called.
- When crossterm events are generated, such as key press, resize window or mouse events, the corresponding `on_key_press`, `on_resize` or `on_mouse_event` functions of the plugin are called.
- The plugin implementation can be seen here.

Right now the plugin just calls `helix::log` whenever it is called with any of the previously described functions.
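The discovery and dispatch steps described above could be sketched roughly as follows. This is not the code from the PR: it uses only the standard library, the `Event` enum is a hypothetical stand-in for the crossterm events, and `find_plugins` and `handler_name` are illustrative names. Actually instantiating each module in wasmtime is left out.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical stand-in for the events that get forwarded to plugins.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Event {
    Start,
    KeyPress,
    MouseEvent,
    Resize,
}

/// Name of the plugin export that would be called for a given event.
pub fn handler_name(event: Event) -> &'static str {
    match event {
        Event::Start => "on_start",
        Event::KeyPress => "on_key_press",
        Event::MouseEvent => "on_mouse_event",
        Event::Resize => "on_resize",
    }
}

/// Collect every `.wasm` file directly inside `dir`, sorted by path.
pub fn find_plugins(dir: &Path) -> io::Result<Vec<PathBuf>> {
    let mut plugins = Vec::new();
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        // Only regular files whose extension is exactly "wasm" count as plugins.
        let is_wasm = path.extension().and_then(|ext| ext.to_str()) == Some("wasm");
        if is_wasm && path.is_file() {
            plugins.push(path);
        }
    }
    plugins.sort();
    Ok(plugins)
}
```

A host would call `find_plugins` once at startup on the plugins directory and then, for each incoming event, look up the export named by `handler_name` in every loaded plugin, skipping plugins that do not define it.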
Thank you for moving this forward.
As mentioned by Blaž, maybe reaching out beforehand would have been nice, just to be on the same page. He also mentioned both of the biggest concerns I had with this PR (wasm engine and protobuf), so I won't go over them again.
I see you commented on the already existing MVP PR. Make sure to check what was done there, and what you can take from the initial attempt. In particular, I wrote down a few intentions in the README of one of the new crates.
Most notably, have a look at `.witx` files (this was not yet in my PR, but this is basically how we want to describe the helix plugin API).
I’ll try to have a more in-depth look later!
@CBenoit the current state of this PR is that it isn't using any codegen, which means I've had to manually write glue code for passing data back and forth (i.e. allocate and deallocate functions, and passing pointers back and forth).
While reading about witx and witx-bindgen, I noticed that WASI is transitioning away from witx to a wit format. Do you think it would make sense to adopt the `wit` format for generating bindings rather than the `witx` format?
I do not recommend either wit or witx. Interface types have been completely deprecated for a new proposal, see here: https://github.com/WebAssembly/interface-types
I think the ABI itself warrants more discussion and we shouldn't rush the implementation. I need to write up a larger note in the issue but I'm currently considering dropping WASM altogether for an embedded scheme implementation. I've experimented both with Chibi Scheme as well as currently working on my own bytecode VM so it better integrates with Rust.
There are other lisps, could you elaborate why do you choose chibi?
Scheme vs. Common Lisp: Scheme's specification is much shorter than Common Lisp's.
Both Guile and Chibi were considered because they implement the r7rs spec and are intended for embedding. Guile would be preferred, but there are no complete bindings for it in Rust and I've seen that there are some issues with it breaking Rust's destructors (https://github.com/ysimonson/guile-sys).
Chibi itself is very small (<10k lines?) and intended for use as an extension language.
From my experiments there could be space for a Rust-based implementation that provides a nicer ABI to bind Rust code. I took a look at https://github.com/mattwparas/steel but I'm currently attempting my own r7rs implementation.
How about Janet? There is a janetrs binding for Rust (I haven't tried it). It looks closer to Clojure, and it works like Lua/Python.
I want to summarize some of what I learned while getting to this point in this PR, in case it aids the conversation regarding the plugin system implementation for helix:
Challenge: Passing complex types between host (helix) and guest (plugin)
The technique used in this PR is essentially raw memory manipulation:
- Plugins export "allocate" and "deallocate" functions (example here)
- Helix manually allocates memory in the plugin for string arguments, copies the string bytes into plugin memory, and then calls plugin functions, passing each string argument as a pointer and length (example here)
- Plugin must then decode the string from raw bytes (example here)
From the perspective of a plugin author, having to manually define "allocate" and "deallocate" functions, and dealing with arguments as pointers is far from ideal.
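To make that burden concrete, here is a minimal sketch of the guest-side glue a plugin author might have to write by hand. It is an illustration under assumptions, not the code linked from this PR: the exact signatures, the `i32` status return of `on_key_press`, and the choice of `std::alloc` are all hypothetical. A real guest would also need to export these symbols unmangled, which is left out here so the sketch builds as ordinary native Rust.

```rust
use std::alloc::{alloc, dealloc, Layout};
use std::{slice, str};

/// Reserve `len` bytes inside the plugin so the host can copy an argument in.
/// Returns null for `len == 0`; the allocator may also return null on failure,
/// so the host has to check the result before writing.
pub extern "C" fn allocate(len: usize) -> *mut u8 {
    if len == 0 {
        return std::ptr::null_mut(); // zero-sized allocations are not allowed
    }
    let layout = Layout::array::<u8>(len).expect("length too large");
    // SAFETY: `layout` has a non-zero size.
    unsafe { alloc(layout) }
}

/// Release a buffer previously returned by `allocate(len)`.
///
/// # Safety
/// `ptr` must come from `allocate` with the same `len` and must not be freed twice.
pub unsafe extern "C" fn deallocate(ptr: *mut u8, len: usize) {
    if ptr.is_null() || len == 0 {
        return;
    }
    let layout = Layout::array::<u8>(len).expect("length too large");
    unsafe { dealloc(ptr, layout) }
}

/// Hypothetical event handler: decodes the (pointer, length) pair back into a
/// string. Returns 0 on success and -1 if the bytes are not valid UTF-8.
///
/// # Safety
/// `ptr` must point to `len` readable bytes (it is ignored when `len == 0`).
pub unsafe extern "C" fn on_key_press(ptr: *const u8, len: usize) -> i32 {
    let bytes: &[u8] = if len == 0 {
        &[]
    } else {
        unsafe { slice::from_raw_parts(ptr, len) }
    };
    match str::from_utf8(bytes) {
        Ok(_key) => 0, // a real plugin would act on `_key` here
        Err(_) => -1,
    }
}
```

On the host side, every call then becomes three steps: call `allocate`, copy the bytes into the returned region, call the handler with the pointer and length, and finally call `deallocate`. Hiding this behind a macro or helper library would have to be repeated for each plugin language.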
The problem of passing complex types could be lessened by using a codegen tool such as "wit-bindgen", but this tool is based on a WebAssembly "interface types" proposal which seems to be deprecated in favor of a new "component-model" proposal, for which there is currently no tooling available.
I cannot see a good way to solve this problem other than writing macros / libs for each plugin language target to hide some of this nastiness or waiting until wasm / wasi is more mature and there are tools/support to solve the problem.
Challenge: Performance
This is just a consideration based on a comment in this PR made by @archseer, but in the case where a large document is being edited, the performance of a wasm runtime based solution is almost certainly not going to be as good as that of a plugin system involving an embedded language.
This is because the wasm plugin runs in an isolated runtime and cannot directly access host memory, meaning that if the plugin wanted to read the entire document, the document would need to be copied into the plugin's memory.
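The difference can be modelled in a few lines. This is a toy model only: `GuestMemory` is a hypothetical stand-in for a plugin's linear memory (a plain `Vec<u8>`, not a wasmtime type), and `borrow_document` stands for what an in-process embedded language could do, assuming it is allowed to reference host data directly.

```rust
/// Stand-in for a guest's linear memory: the only bytes a wasm plugin can address.
pub struct GuestMemory {
    bytes: Vec<u8>,
}

impl GuestMemory {
    pub fn new() -> Self {
        GuestMemory { bytes: Vec::new() }
    }

    /// Copy `data` into guest memory and return its (offset, len).
    /// The cost is proportional to `data.len()` on every call.
    pub fn write(&mut self, data: &[u8]) -> (usize, usize) {
        let offset = self.bytes.len();
        self.bytes.extend_from_slice(data);
        (offset, data.len())
    }

    /// What the guest sees when it reads the region back.
    pub fn read(&self, offset: usize, len: usize) -> &[u8] {
        &self.bytes[offset..offset + len]
    }

    /// Total number of bytes currently held by the guest.
    pub fn size(&self) -> usize {
        self.bytes.len()
    }
}

/// An embedded, in-process extension language could instead be handed a
/// borrowed view of the host's text, with no copy at all.
pub fn borrow_document(doc: &str) -> &str {
    doc
}
```

In this model, exposing a 100MB document to the guest grows `GuestMemory` by 100MB and touches every byte, whereas the borrowed view costs the same regardless of document size.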
Challenge: WASI support
As pointed out by @coderedart in this comment:
> Just trying to point out that we are years away from this utopia of "use wasm to support all languages". i have seen a lot of comments argue for wasm using "plugin authors can use any language they want" merit which is just a dream for now.
When I look for languages that support WASM / WASI, it's relatively easy to find examples of languages that support embedding a WASM runtime, but hard to find examples of languages, other than C and Rust, that can be compiled to a ".wasm" file that can then be consumed by a WASM runtime.
Summary
I love the idea of a wasm based plugin system where plugins can be written in any language but through the work in this PR I have come to believe that WASM and WASI are not yet at the maturity required to make that dream a reality.
I look forward to what will be possible in hopefully the not too distant future but in the interim an embedded language solution such as being suggested by @archseer seems like a better path forward.
I, for one, am very supportive of Scheme as an extension language. It's an incredibly small language that's easy to pick up, and it has some key features that I think are critical for an extension language:
- Macros (i.e. syntactic extension mechanisms)
- It's an actual language.
- Simple and easily embedded (you already know this, re: Guile and Chibi)
Macros are a necessity in a configuration/extension language (well really my personal position extends this to any language I use). I've used Vim in the past and been a heavy Emacs user for more than a decade. I've been giving Neovim and Helix a shot lately. While Lua is a huge improvement over Vimscript, the amount of boilerplate that's necessary for some pretty straightforward things is astounding. The excellent plugin LuaSnip (a snippet plugin) is a great example.
LuaSnip basically removes the shackles of most snippet plugins by allowing you to write snippets using an API that's just Lua functions. It's incredibly powerful, but it's also a bear to actually define snippets. You basically have two options: write the snippet directly using the function API which is essentially writing what should be an intermediate representation by hand or using the "format" function which is essentially writing everything inside of an interpolated string. Neither option is great.
The nice thing about Lua though is that Fennel is a Lisp that compiles to Lua, so you can kind of have your cake and eat it too. You might have to wade through the traces that come from the compiled Lua though.
I know earlier in this discussion (maybe in another thread) some people expressed hesitancy surrounding using a full-blown programming language as the extension format. I'm glad that seems to be fading into the background. In my opinion it's better to acknowledge that you can't foresee what your users will want or need. Deciding what they should or should not do with something, or dictating the "right way", rather than giving them the power to choose, is always a mistake in retrospect.
Simple and easily embedded speak for themselves.
FYI https://github.com/bytecodealliance/cargo-component exists as an early implementation of the component model
@JHonaker well put. +1 for this approach.
Is there any chance for existing VS Code extensions to work in helix through some kind of adapter?
@KalinDimitrow I don't think that would be more practical than a rewrite.
For most cases yes, but for closed-source ones like tab9, copilot or gitkraken there won't be any alternative. Anyway, those are not essential. I'm just curious whether the adapter pattern can be used here.
> Is there any chance for existing VS Code extensions to work in helix through some kind of adapter?
This won't happen. We'd have to copy the vscode extension API 1:1 which would be a massive endeavor.
> I do not recommend either wit or witx. Interface types have been completely deprecated for a new proposal, see here: https://github.com/WebAssembly/interface-types
> I think the ABI itself warrants more discussion and we shouldn't rush the implementation. I need to write up a larger note in the issue but I'm currently considering dropping WASM altogether for an embedded scheme implementation. I've experimented both with Chibi Scheme as well as currently working on my own bytecode VM so it better integrates with Rust.
Please also check s7 scheme. It's small, but consistent and powerful, with a bunch of great convenience features that make it a real joy to work with. I heard it's very simple to embed, too.
Closing this PR for now
If you want to continue the discussion please do so in #3806