mio icon indicating copy to clipboard operation
mio copied to clipboard

add `wasm32-wasip2` support

Open dicej opened this issue 1 year ago β€’ 41 comments

This implementation currently uses a mix of POSIX-style APIs (provided by wasi-libc via the libc crate) and WASIp2-native APIs (provided by the wasi crate).

Alternatively, we could implement Selector using only POSIX APIs, e.g. poll(2). However, that would add an extra layer of abstraction to support and debug, as well as make it impossible to support polling wasi:io/poll/pollable objects which cannot be represented as POSIX file descriptors (e.g. timer events, DNS queries, HTTP requests, etc.).

Another approach would be to use only the WASIp2 APIs and bypass wasi-libc entirely. However, that would break interoperability with both Rust std and e.g. C libraries which expect to work with file descriptors.

Since wasi-libc does not yet provide a public API for converting between file descriptors and WASIp2 resource handles, we currently use a non-public API (see the netc module below) to do so. Once https://github.com/WebAssembly/wasi-libc/issues/542 is addressed, we'll be able to switch to a public API.

I've tested this end-to-end using https://github.com/dicej/wasi-sockets-tests, which includes smoke tests for mio, tokio, tokio-postgres, etc.

dicej avatar Oct 09 '24 15:10 dicej

I just realized this PR doesn't include support for accepting incoming connections -- only initiating outgoing ones. I forgot about the former since I don't personally have an urgent need to support it, but it could be added as a follow-up PR. To be clear: WASIp2 is fully capable of handling that case.

dicej avatar Oct 09 '24 17:10 dicej

Also, I have a last clarifying question: reading wasi-libc I assumed that all Pollable are abstracted as a Socket but this socket is a "fake" one in the sense that we only use it to convey the "readiness" of the underlying Pollable irrespectively of whether it is actually obtained through wasi:sockets.

But your comment:

I just realized this PR doesn't include support for accepting incoming connections -- only initiating outgoing ones. I forgot about the former since I don't personally have an urgent need to support it, but it could be added as a follow-up PR. To be clear: WASIp2 is fully capable of handling that case.

Actually speak of connections, so I am not sure my previous assumption was correct.

Anyway, great job πŸ’ͺ

raskyld avatar Oct 09 '24 19:10 raskyld

Also, I have a last clarifying question: reading wasi-libc I assumed that all Pollable are abstracted as a Socket but this socket is a "fake" one in the sense that we only use it to convey the "readiness" of the underlying Pollable irrespectively of whether it is actually obtained through wasi:sockets.

WASIp2 represents things like TCP and UDP sockets as resources (e.g. tcp-socket) which have methods for binding, listening, connecting, etc. Once a socket is connected, you can get its input-stream and output-stream, which are also resources. And each of those can be used to obtain a pollable representing read and write readiness, respectively.

Consequently, wasi-libc must keep track of up to six resource handles for each socket:

  • a tcp-socket or udp-socket, depending on the socket type
  • a pollable representing readiness of any in-progress bind, listen, connect, or accept operation
  • (if connected) an input-stream, an output-stream, and one pollable each for reading and writing

So you can think of a wasi-libc socket file descriptor as uniquely identifying a bundle of resource handles, the number and types of which depends on the state that socket is in.

Adding support for binding, listening and accepting WASIp2 sockets in mio would amount to adding match cases for e.g. tcp_socket_state_tag_t::TCP_SOCKET_STATE_UNBOUND, tcp_socket_state_tag_t::TCP_SOCKET_STATE_BOUND, and tcp_socket_state_tag_t::TCP_SOCKET_STATE_LISTENING and using the socket_pollable field of tcp_socket_t to await transitions to the next state.

Does that help? Happy to go into more detail if desired. Also, the wasi-sockets docs are quite thorough if you haven't perused them yet.

dicej avatar Oct 09 '24 19:10 dicej

BTW, this PR doesn't include UDP support either, again because that hasn't been a priority for me. Shouldn't be hard to add as a follow-up PR.

dicej avatar Oct 09 '24 19:10 dicej

Thanks a lot for the time you took answering me πŸ™ Your answer is super clear!

My question was more related to how pollable obtained from types not related to sockets are usable in mio context, for example, the streams you mentioned can also be obtained through the wasi:http world: https://github.com/WebAssembly/wasi-http/blob/main/wit/types.wit#L510

In this case, I suspect you still have the streams and the pollable but not the sockets resource handles. Could you still use those resource handles to Poll them in mio? IIUC, the philosophy of the crate is to build upon any event source independently of the platform primitive under of that but I may misunderstand it!

raskyld avatar Oct 09 '24 20:10 raskyld

My question was more related to how pollable obtained from types not related to sockets are usable in mio context, for example, the streams you mentioned can also be obtained through the wasi:http world: https://github.com/WebAssembly/wasi-http/blob/main/wit/types.wit#L510

Oh right, great question. Yeah, I think we'd need to add a new, WASIp2-only API for registering pollables that have no corresponding file descriptors (and likewise for tokio, presumably). This PR clearly doesn't include such a thing, but it's something we could add in another PR.

Another approach would be to add an API to wasi-libc that accepts an arbitrary pollable and allocates a file descriptor for it.

@badeend and @sunfishcode might have thoughts about this.

dicej avatar Oct 09 '24 20:10 dicej

Alright thanks for the answer! Reading https://github.com/WebAssembly/wasi-libc/issues/542 it seems to me that the question of allocating fd to pollable was among the initial options, I wonder though how would you implement stuff like fstat for those fd πŸ€” but let's not pollute your MR!

To get back to the main subject, I just remembered I wondered why you use atomic types and mutex in the PR since wasip2 is single-threaded? Is it because of the thread proposal ?

raskyld avatar Oct 09 '24 20:10 raskyld

To get back to the main subject, I just remembered I wondered why you use atomic types and mutex in the PR since wasip2 is single-threaded? Is it because of the thread proposal ?

Honestly, I just copied that from the existing WASIp1 implementation. I wrote this code almost a year ago and only came back to it yesterday, so it's not super fresh in my mind, but that part at least came straight from the existing code. I'm guessing the original code either needed it to make the compiler happy (e.g. make Selector Send and Sync) or for future-proofing. Arc and Mutex are roughly equivalent to Rc and RefCell on single-threaded Wasm, anyway, so there shouldn't be a performance penalty.

dicej avatar Oct 09 '24 21:10 dicej

After a good night of sleep, I realised the name of the PR should probably mention that it only adds support for established wasi:sockets so it's clear that we will need follow-up PRs for other pollable and for initiating connections.

WDYT?

raskyld avatar Oct 10 '24 12:10 raskyld

After a good night of sleep, I realised the name of the PR should probably mention that it only adds support for established wasi:sockets so it's clear that we will need follow-up PRs for other pollable and for initiating connections.

WDYT?

Makes sense; I'll add some TODO comments to the code and to the commit message.

dicej avatar Oct 10 '24 13:10 dicej

Can we split this up in multiple prs as this seems to do multiple things.

1. A minimal pr that adds support for v2, no adding of v2 functionality.

2. The selector rewrite (not sure why this is needed)

3. Any v2 additional we make, such as support for more API

Items 1 and 2 are intertwined. None of the WASIp1 APIs are available in WASIp2, so a new selector implementation is needed to target the new APIs. There is an adapter which implements (most of) the WASIp1 APIs in terms of their WASIp2 counterparts, but wasi-libc doesn't use the adapter for sockets -- it uses the WASIp2 APIs directly in order to access the much broader socket support that WASIp2 provides.

As I mentioned above, we do have a few different options for a WASIp2 selector implementation (use POSIX poll(2), use the WASIp2 APIs directly, or use a mix of POSIX file descriptors and direct WASIp2 APIs for maximum compatibility), but reusing the WASIp1 implementation is not one of those options.

Totally agreed that we can leave item 3 in your list for a later PR, though.

dicej avatar Oct 14 '24 17:10 dicej

Just to clarify, is this currently "waiting-on-review", to use the rust-lang parlance? Or, conversely, is further review currently blocked on any action from the author?

jeffparsons avatar Oct 25 '24 21:10 jeffparsons

@jeffparsons this is waiting on an action on https://github.com/tokio-rs/mio/pull/1836#pullrequestreview-2367086886. We currently don't have the capacity to review a pr of this size, so it needs to be split up.

Thomasdezeeuw avatar Oct 26 '24 10:10 Thomasdezeeuw

@jeffparsons this is waiting on an action on #1836 (review). We currently don't have the capacity to review a pr of this size, so it needs to be split up.

Thanks, @Thomasdezeeuw. I wonder if that was perhaps not clear to @dicej because his later comment suggests there's no natural decomposition of what's in this PR, which gave me the impression that he was waiting for feedback on his explanation.

Just floating this idea: would it be possible/acceptable to both author and reviewer to split this into at least:

  • just common plumbing (e.g. selector) without any actual socket APIs
  • outgoing socket APIs

Would that make review more manageable? Are there other axes this could be reasonably split on?

I'm just an observer, so sorry if this isn't helpful. Just trying to see if there's anything that could get this out of limbo.

πŸ’–

jeffparsons avatar Oct 27 '24 04:10 jeffparsons

@jeffparsons this is waiting on an action on #1836 (review). We currently don't have the capacity to review a pr of this size, so it needs to be split up.

Along the lines of what @jeffparsons suggested, I can definitely split this up if we're not concerned about delivering useful functionality to wasm32-wasip2 users right off the bat. For example, I could start with a minimal scaffolding for the new selector that compiles but with only todo!() in the function bodies, then reintroduce the rest of the implementation incrementally until we have something that's actually useful. How does that sound?

dicej avatar Oct 28 '24 14:10 dicej

@Thomasdezeeuw Does the proposal in https://github.com/tokio-rs/mio/pull/1836#issuecomment-2441708065 sound okay to you?

jeffparsons avatar Nov 07 '24 01:11 jeffparsons

Thanks for the feedback, @Thomasdezeeuw !

poll_oneoff seems to have been changed to poll, except that wasi now allocates for the result, which we do not want.

That's an artifact of how WASIp2's poll function is defined, along with the underlying ABI, which determines how the host allocates and returns a list<u32> from a function. Specifically, the host calls a cabi_realloc function exported by the guest to do the allocation.

It's possible to optimize that by pre-allocating an conservatively-sized buffer before calling poll, storing a pointer to it in a global, and then returning that pointer when the host calls cabi_realloc. Then we could reuse that buffer across Selector::select calls, eliminating any per-call allocation overhead. The downside is that we'd have to write that code by hand rather than use the auto-generated bindings from the wasi crate, but maybe it's worth it in this case?

Specific to this pr, the netc module with all the type definitions will not be accepted.

That's fair. Seems like libc would be a good home for those.

Looking at wasi-p2 to me it seems we can change poll_oneoff to poll and it should mostly work (though I would expect certain edge cases).

The main difference between the WASIp1 and WASIp2 APIs is that the latter doesn't have any types corresponding to p1's subscription and eventtype types, etc. In p1, those types allowed the caller to tell poll_oneoff which events were of interest (in the case of subcription) and poll_oneoff to tell the caller which events actually occurred (in the case of eventtype). In p2, we must create and pass a separate pollable handle to poll for each event of interest.

In the case of p2, the pollable type is opaque, so we need to remember which socket (or other poll-able thing) and operation (e.g. read, write, or accept) that pollable represents, which accounts for some of the differences between p1.rs and p2.rs. The other differences are related to the fact that p2 simply supports more things (e.g. making outgoing connections) than p1 did.

That said, I definitely hear what you're saying about not wanting to review or maintain a lot of new code. With that in mind, we could come at this from a completely different direction, per the code comment I put in p2.rs:

// Alternatively, we could implement Selector using only POSIX APIs, // e.g. poll(2). However, that would add an extra layer of abstraction to // support and debug, as well as make it impossible to support polling // wasi:io/poll/pollable objects which cannot be represented as POSIX file // descriptors (e.g. timer events, DNS queries, HTTP requests, etc.).

Although I am a bit concerned about the extra level of abstraction, simply reusing src/sys/unix/selector/poll.rs for WASIp2 certainly has some appeal. Then we'd let wasi-libc take care of the details of abstracting the WASIp2 interface in terms of the POSIX poll(2) interface (e.g. wrangling pollables and keeping track of what they correspond to). And despite what I said in that comment, we could still potentially support polling other poll-able things like HTTP requests using this approach.

@Thomasdezeeuw, what do you think of the following as a path forward?

  • I close this PR
  • Someone (maybe me or someone else if they beat me to it) tests to verify that src/sys/unix/selector/poll.rs works correctly on WASIp2 and opens a PR to use that instead of src/sys/wasi/mod.rs when building for wasm32-wasip2, plus whatever other minor changes are needed (e.g. the tweaks I made to src/net/tcp/stream.rs in this PR)

dicej avatar Nov 13 '24 19:11 dicej

What’s the path forward here? Submitting smaller PRs to make reviewing easier? Would love to see this land!

fbjork avatar Feb 01 '25 09:02 fbjork

Could somebody tell me what is needed to get this one merged? I could try to finish it up, but I'm not sure why the current PR is not good enough and how would a possible PR from me be something that could be merged to main.

We'd like to start using the async ecosystem in wasi, and there are runtimes such as wstd that already works. But for some reason the whole rust async ecosystem depends on Tokio... I wonder how back in the days we could've abstracted the runtimes so we wouldn't be in this situation now...

pimeys avatar Feb 13 '25 13:02 pimeys

@pimeys @fbjork I'll try to summarize the situation as I understand it (see the earlier comments for details):

@Thomasdezeeuw feels this is not the right approach and would prefer a smaller, easier to review PR (i.e. not just breaking this one up into smaller pieces, but smaller overall). To address that, I've proposed that we treat WASIp2 as "just another POSIX system" and use the generic POSIX poll(2)-based implementation in src/sys/unix/selector/poll.rs instead of a WASI-specific implementation, leaving the details of WASIp2 interop to wasi-libc. I expect that would require minimal changes to mio, amounting mostly to tweaking some #[cfg(...)] annotations here and there. Whether such a PR would be accepted is something only a maintainer can answer, however.

dicej avatar Feb 13 '25 14:02 dicej

@pimeys @fbjork I'll try to summarize the situation as I understand it (see the earlier comments for details):

@Thomasdezeeuw feels this is not the right approach and would prefer a smaller, easier to review PR (i.e. not just breaking this one up into smaller pieces, but smaller overall). To address that, I've proposed that we treat WASIp2 as "just another POSIX system" and use the generic POSIX poll(2)-based implementation in src/sys/unix/selector/poll.rs instead of a WASI-specific implementation, leaving the details of WASIp2 interop to wasi-libc. I expect that would require minimal changes to mio, amounting mostly to tweaking some #[cfg(...)] annotations here and there. Whether such a PR would be accepted is something only a maintainer can answer, however.

Right. So instead of implementing our own module, we just use the unix module? It should just work if we have the corresponding wit imported to wasmtime?

Btw, is there a way to run the tests in mio with wasm32-wasip2 target?

pimeys avatar Feb 13 '25 15:02 pimeys

Right. So instead of implementing our own module, we just use the unix module?

Correct.

It should just work if we have the corresponding wit imported to wasmtime?

I'm not sure what you mean by that. When you tell Cargo to target wasm32-wasip2, it will statically link in wasi-libc, which will reference the WASIp2 interface definitions (e.g. wasi-clocks, wasi-filesystem, wasi-sockets, etc.). The resulting component will contain everything wasmtime run needs to know to run it -- no additional WIT files required.

Btw, is there a way to run the tests in mio with wasm32-wasip2 target?

I haven't tried that, but it should be doable by using wasmtime run as the test runner. I've written some end-to-end tests here that exercise tokio, tokio-postgres, etc.

dicej avatar Feb 13 '25 15:02 dicej

@dicej how should the wakers be created in wasip2? There's some abstractions in the wasi crate e.g. for pollables that could be used here. But how does this fit in the context of mio? Surely none of the eventfd, kqueue nor pipe wakers can be used here, we need something for wasi that can be used together with the pollables?

pimeys avatar Feb 13 '25 17:02 pimeys

@dicej how should the wakers be created in wasip2? There's some abstractions in the wasi crate e.g. for pollables that could be used here. But how does this fit in the context of mio? Surely none of the eventfd, kqueue nor pipe wakers can be used here, we need something for wasi that can be used together with the pollables?

The idea presented here is to add an API to wasi-libc to allow pollables to be converted to file descriptors, making them usable with mio, poll(2), etc. We'd need to implement that to support polling arbitrary pollables via mio, but I think that should be straightforward.

dicej avatar Feb 13 '25 18:02 dicej

The idea https://github.com/WebAssembly/wasi-libc/issues/542 is to add an API to wasi-libc to allow pollables to be converted to file descriptors, making them usable with mio, poll(2), etc. We'd need to implement that to support polling arbitrary pollables via mio, but I think that should be straightforward.

Interesting. But we can already work without these wakers. Sorry for asking a bit noob questions, but does that mean our waker is woken immediately and we consume quite a bit more CPU until we get proper wakers for mio?

I got your Postgres example running, so clearly we can work without for now...

pimeys avatar Feb 13 '25 19:02 pimeys

Interesting. But we can already work without these wakers. Sorry for asking a bit noob questions, but does that mean our waker is woken immediately and we consume quite a bit more CPU until we get proper wakers for mio?

Sorry, I guess I don't understand the question. poll(2) will block until at least one of the file descriptors you pass it is ready (or a timeout occurs). There's no need to busy wait and waste CPU or anything like that.

What I'm proposing is that mio uses poll(2) for WASIp2, which should be only slightly less efficient (due to an extra layer of abstraction and the necessity of converting between file descriptors and pollables) than calling the native WASIp2 wasi:io/poll#poll function directly. I expect the overhead will be negligible.

Can you clarify what you mean by "our waker is woken immediately"? It should only be woken when the pollable is ready.

dicej avatar Feb 13 '25 19:02 dicej

Interesting. But we can already work without these wakers. Sorry for asking a bit noob questions, but does that mean our waker is woken immediately and we consume quite a bit more CPU until we get proper wakers for mio?

Sorry, I guess I don't understand the question. poll(2) will block until at least one of the file descriptors you pass it is ready (or a timeout occurs). There's no need to busy wait and waste CPU or anything like that.

What I'm proposing is that mio uses poll(2) for WASIp2, which should be only slightly less efficient (due to an extra layer of abstraction and the necessity of converting between file descriptors and pollables) than calling the native WASIp2 wasi:io/poll#poll function directly. I expect the overhead will be negligible.

Can you clarify what you mean by "our waker is woken immediately"? It should only be woken when the pollable is ready.

I mean if we do not implement or use any waker in mio. If we cannot use the file descriptor waker yet from the libc-wasi, should we expect some issues? How is it currently working in your branch, I don't see you enabling any of the mio's waker implementations.

pimeys avatar Feb 13 '25 19:02 pimeys

I mean if we do not implement or use any waker in mio. If we cannot use the file descriptor waker yet from the libc-wasi, should we expect some issues? How is it currently working in your branch, I don't see you enabling any of the mio's waker implementations.

Are you talking about https://docs.rs/mio/1.0.2/mio/struct.Waker.html? That's only relevant for multithreaded systems, and neither WASIp1 nor WASIp2 support multithreading.

dicej avatar Feb 13 '25 19:02 dicej

I mean if we do not implement or use any waker in mio. If we cannot use the file descriptor waker yet from the libc-wasi, should we expect some issues? How is it currently working in your branch, I don't see you enabling any of the mio's waker implementations.

Are you talking about https://docs.rs/mio/1.0.2/mio/struct.Waker.html? That's only relevant for multithreaded systems, and neither WASIp1 nor WASIp2 support multithreading.

Ah, now I understand what you're getting at. I see now that sys/unix/selector/poll.rs uses Waker internally and the existing implementations in sys/unix/waker aren't suitable. Sorry, I misunderstood your original question completely.

Yeah, if poll.rs fundamentally assumes multithreading then it won't be a good fit. In that case, we can still aim for a simpler implementation than this current PR has by using poll(2), but it would look different from poll.rs.

dicej avatar Feb 13 '25 21:02 dicej

I've been spending the whole day reading mio and tokio sources, understanding where all pieces fit. I'm thinking that writing another selector doesn't sound so good because we can do most of the poll(2) in wasip2 already. The only missing link is the missing eventfd syscall, meaning we cannot use a file descriptor for waking up the poll.

What I'm seeing here is two options:

  • Either there is some way of creating a waker to work together with the code in selector/poll.rs
  • Or we write a new selector code for wasi, utilizing whatever @dicej wrote in their p2.rs. But not taking in the wasi crate, just using what we have in libc.

This is annoyingly close what I have in my branch, but I might be in a dead end...

pimeys avatar Feb 14 '25 16:02 pimeys