as-wasi

Implementing "libpreopen" functionality

Open MarkMcCaskey opened this issue 5 years ago • 4 comments

Hello! I've run a lot of AssemblyScript Wasm modules while working on Wasmer's WASI implementation and noticed that AssemblyScript doesn't seem to do the same kind of filesystem setup work that Rust and C do with regard to preopened directories. This manifests as AssemblyScript Wasm modules behaving incorrectly (well, at least differently from Rust and C) when dealing with multiple preopened directories and, in some cases, with relative paths.

I have a lot of context on the WASI filesystem and would be happy to make a PR implementing that logic!

I haven't had a chance to take a good look at the code yet, but part of what we'll need is to execute logic in _start, before main. If we can't do that, then that's something else we'll need to implement.

It looks like the primary thing to update is here. As part of this, I believe we'll need to do a prefix search. I'll start with a linear search based on string comparison, which can later be turned into iterative hashing and then a proper data structure like a prefix tree. Given the number of preopened directories in the cases I've seen, linear search or iterative substring hashing (at the file component level) is probably the right trade-off between binary size and speed here.

So it really should just come down to:

  • [ ] Hook into _start
  • [ ] Enumerate the files and prestat them to build up some global data structure
  • [ ] Implement dirfdForPath (finding the right fd with the data structure from the step above)
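For illustration, the last step could look roughly like the sketch below: a linear search that compares whole path components and keeps the longest matching preopen. The `Preopen` and `Resolved` types and the `components` helper are made up for this example, and the sketch assumes the list of preopens has already been built. It ignores leading slashes, `..` segments, and symbolic links, so it is a starting point rather than a complete resolver.

```typescript
// Hypothetical record for one preopened directory (illustrative names).
interface Preopen {
  fd: number;
  path: string; // path the host reported for this fd, e.g. "/data"
}

// Hypothetical result: the fd to use plus the path relative to it.
interface Resolved {
  fd: number;
  relativePath: string;
}

// Split a path into components, dropping empty and "." segments.
// Simplification: leading slashes are ignored, so "/data" and "data" match.
function components(path: string): string[] {
  return path.split("/").filter((c) => c.length > 0 && c !== ".");
}

// Linear search: pick the preopen whose components form the longest
// prefix of the target path's components.
function dirfdForPath(preopens: Preopen[], path: string): Resolved | null {
  const target = components(path);
  let best: Preopen | null = null;
  let bestLen = -1;
  for (const p of preopens) {
    const prefix = components(p.path);
    if (prefix.length > target.length) continue;
    let matches = true;
    for (let i = 0; i < prefix.length; i++) {
      if (prefix[i] !== target[i]) {
        matches = false;
        break;
      }
    }
    if (matches && prefix.length > bestLen) {
      best = p;
      bestLen = prefix.length;
    }
  }
  if (best === null) return null;
  const rest = target.slice(bestLen).join("/");
  return { fd: best.fd, relativePath: rest.length > 0 ? rest : "." };
}
```

Comparing components rather than raw strings means a preopen at `/data` does not accidentally claim `/database/x`. With only a handful of preopens, the nested loops stay cheap and small in the binary.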

MarkMcCaskey, Jan 16 '20 00:01

Yoooo @MarkMcCaskey Whaddup haha! :smile:

So I also ran into this recently, when trying to support multiple preopened directories (I think this is what this issue is referring to?). And I found out how this is done in wasi libc:

https://github.com/WebAssembly/wasi-libc/blob/7b92f334e69c60a1d1c5d3e289790d790b9a185b/libc-bottom-half/libpreopen/libpreopen.c#L550

The idea is: Start at file descriptor 3, loop until you get an error, and for every iteration, get the path to the file descriptor :smile:

That way you can find all the preopened directories, and then from there like you were saying, do some prefixing and things to find which file descriptor maps to each path that is then trying to be opened :smile:
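As a rough sketch of that loop (not the as-wasi code): in WASI the real calls are `fd_prestat_get` and `fd_prestat_dir_name`. The `PreopenHost` interface below is a hypothetical stand-in for them, so the example can run outside a Wasm runtime, and `enumeratePreopens` is an illustrative name.

```typescript
// Hypothetical record for one preopened directory (illustrative names).
interface Preopen {
  fd: number;
  path: string;
}

// Hypothetical stand-in for the WASI calls fd_prestat_get and
// fd_prestat_dir_name: returns the preopen's path, or null when the
// fd is not a preopened directory (the real calls report an errno).
interface PreopenHost {
  prestatDirName(fd: number): string | null;
}

// Start at fd 3 (0, 1, and 2 are stdin, stdout, and stderr) and keep
// going until the host reports an error for the next fd.
function enumeratePreopens(host: PreopenHost): Preopen[] {
  const found: Preopen[] = [];
  for (let fd = 3; ; fd++) {
    const path = host.prestatDirName(fd);
    if (path === null) break;
    found.push({ fd, path });
  }
  return found;
}

// Usage with a fake host that exposes two preopened directories.
const table = new Map<number, string>([
  [3, "/"],
  [4, "/data"],
]);
const fakeHost: PreopenHost = {
  prestatDirName(fd: number): string | null {
    const p = table.get(fd);
    return p === undefined ? null : p;
  },
};
const preopenList = enumeratePreopens(fakeHost);
```

The resulting list is what a prefix lookup would then search. It could be built once and cached the first time a path is opened.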

Honestly, I think this logic can be done lazily / dynamically in: https://github.com/jedisct1/as-wasi/blob/master/assembly/as-wasi.ts#L771

This isn't a feature that is super pressing to me right now, but that's all I think you would need to do to get a working implementation. I can probably do this when I find the time, or you, @jedisct1, can do it, whoever gets to it first really haha :joy:

torch2424, May 22 '20 22:05

That can definitely be implemented. But symbolic links are what make this way more complicated than just matching prefixes.

That being said... honestly... has anyone ever used more than one preopened directory in practice?

I feel like all the actual deployments just provide a base directory for everything.

jedisct1, May 22 '20 22:05

> But symbolic links are what make this way more complicated than just matching prefixes.

Ahhhhh thanks for the heads up! I didn't even consider symlinks :upside_down_face: Do you know what the general hostcall order looks like for that? :thinking:

> That being said... honestly... has anyone ever used more than one preopened directory in practice?

I've seen it happen on more than a couple occasions. Like, I'd even dare to say 20% of the time they will use more than one.

I don't think this is high priority really, but at least we have somewhat of a solution for now for anyone who comes across this :smile:

torch2424, May 22 '20 22:05

Oh! For this issue, I figured out how to do it I think.

We need to re-implement this function: https://github.com/WebAssembly/wasi-libc/blob/7b92f334e69c60a1d1c5d3e289790d790b9a185b/libc-bottom-half/libpreopen/libpreopen.c#L488

(if not that function, the general path resolution is done with a combination of functions in that specific file :smile: )

This was shared with me by @pchickey

This is how wasi libc finds paths :smile:

torch2424, Aug 20 '20 22:08