component-model WIT Syntax: `world`

This PR sketches out the initial format for WebAssembly Worlds.

Signed-off-by: Brian H [email protected]

Aug 04 '22 18:08 fibonacci1729

I've talked with @fibonacci1729 a bit about this but I wanted to write something down here as well to make it more broadly known. Personally I think it would be best to not have what feels like "yet another text format" with the component model and figure out a way to fit this into existing constructs, e.g. *.wit files. There are two major downsides I feel adding a *.world syntax brings:

Primarily it feels like "yet another syntax" to learn when dealing with the component model. The component model is already very large and takes quite some time to boot up context on, and tacking another step onto the path-to-learning it feels like one that would be best avoided. The complexity of what's being added here needs to live somewhere, of course, but at least to me it feels bigger in a sense if it's in a new text syntax with a new extension and a new kind of file.
I personally feel like *.world is "weird" in that it must reference *.wit files and there's no way to "inline" everything into one file. This forces content to be organized into separate files or blobs, and while that's at least conceptually what is desired for many repositories, it isn't the best for all uses. For example, tests get verbose if they're strewn across many files, and it's not possible to "share a thing" to talk about a *.world; instead a "set of things" needs to be shared (for example, you can't gist a world to someone: you have to gist the world, the wit, and the relative folder structure of everything within).

To be clear, my concerns are at the syntactic and organization layer. The capabilities and concepts expressed in *.world files I think are all fine; I'm hoping that there's a better way to express all of this.

I've been hesitant to write any of this down without any actual suggestion of something else to do, and while I have some small inspiration it still doesn't feel great. A rough idea of how we could shoehorn this is to merge the syntaxes of world and wit files and have both a top-level and an interface parsing context. For example:

// declare an "interface" in a top-level context
interface "wasi:types" {
    enum errno { ... }
}
 
// declare another interface
interface "wasi:http" {
    use { errno } from "wasi:types" // resolved from other `interface` declarations in the top-level

    // ...
}

interface "wasi:http/handler" {
    // ...
}

import backend: "wasi:http/handler"

// could support inline definitions of imports as well
import my-platform: {
    use { errno } from "wasi:types"

    my-api: func() -> result<u32, errno>
}

export "wasi:http/handler"

This solves the two issues I brought up by (a) merging the two syntaxes so there's only one "thing" to talk about, albeit the "thing" is the sum of the two prior sizes so it's only less complicated subjectively I think because there's not "world or wit" it's just all wit. And (b) with merged syntaxes it's possible to have everything inline in one file for tests/sharing/etc where it probably wouldn't be the idiomatic organization that everyone uses but I think would still be useful to have. My thinking is that the strings referring to other interfaces are looked up locally within the file first (e.g. prior interface "..." declarations) and if not otherwise specified it performs the resolution it would otherwise do today (e.g. query the filesystem, query the registry, etc).
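To make the lookup order concrete, here is a minimal Python sketch of that rule: local `interface "..."` declarations first, then whatever resolution happens today. Everything in it is hypothetical; `resolve_interface`, the `local_interfaces` dict and the `external_lookup` callback are illustration names, not part of any existing tool.

```python
from typing import Callable, Dict, Optional


def resolve_interface(
    name: str,
    local_interfaces: Dict[str, str],
    external_lookup: Callable[[str], Optional[str]],
) -> str:
    """Resolve `name`, preferring declarations made earlier in the same file."""
    # 1. Local `interface "..."` declarations win.
    if name in local_interfaces:
        return local_interfaces[name]
    # 2. Otherwise fall back to the existing resolution (filesystem,
    #    registry, ...), modeled here as an opaque callback.
    found = external_lookup(name)
    if found is None:
        raise LookupError(f"unresolved interface: {name!r}")
    return found


# Hypothetical contents: interface bodies are modeled as plain strings.
local = {"wasi:types": "enum errno { ... }"}
registry = {"wasi:http/handler": "// fetched externally"}

print(resolve_interface("wasi:types", local, registry.get))
print(resolve_interface("wasi:http/handler", local, registry.get))
```

The callback keeps the sketch agnostic about where the fallback actually looks, which is the part that is still undecided here.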

I don't think this is the complete picture though because if you try to define "wasi:http/handler" in a dedicated *.wit file it would then have slightly different syntax than this file because the top-level is different. Given the structure the top-level is either the "world" context or the "interface" context and that's not necessarily known ahead of time. That sort of points back in the other direction of having two file extensions and two file formats. Personally, though, I feel like the component model already has enough if not too many concepts to learn so I would subjectively weigh the "merge concepts where we can" concern more heavily.

Sep 14 '22 14:09 alexcrichton

Great points and ideas. First, I definitely agree we should allow for having all-in-one files that contain all relevant definitions. Initially I had been thinking of starting with World files only containing references to Wit files via URL as an incremental baby step towards Worlds (which also matches what I think we'd want to standardize in a WASI context), but that does neglect the testing use case you brought up which does seem like a near-term necessity, so I think it makes sense to eagerly include inline definitions, including interface definitions like you're suggesting. With that addition, interfaces and worlds effectively share the same syntax; the only question is of the "top-level" grammar and file extension.

As for whether we should:

1. have a single .wit file syntax that can define both interfaces and worlds; or
2. separate .wit and .world files that can both define nested definitions but ultimately define a single, top-level interface or world, resp.,

I'm not sure yet. It seems like there are still some details to figure out for 1, as you pointed out so I'm interested to discuss this more.

One thing thinking about this has helped me realize is that I've been saying "wit" to mean "the syntax for defining an interface" mostly just because that's what it has meant for a while. But once you have two kinds of definitions, "interface" and "world", it does feel natural from first principles to have a single term (such as "wit") referring to "the whole IDL". So I think that is also a part of this choice between 1 and 2: whether "wit" is associated with "interfaces" or whether "wit" means "the whole IDL".

In any case, I do really like the term "world" meaning "a set of imports and exports". Previously we called it a "profile", but that doesn't seem to resonate with folks the way I initially thought it would; whereas "world" does (credit to Dan for suggesting the term). So whether it's in the filename extension and/or inline in the text as a keyword, I'd love to keep world.

Sep 14 '22 23:09 lukewagner

Thinking about this a bit more, the idea of saying that there is a single "Wit" syntax containing both interface and world definitions has grown on me. So, riffing on what Alex wrote above to try to answer the question he asked at the end: what if the rule was, if you have a file foo/bar.wit, it must contain one top-level definition named bar, so that when you name the file externally, you know which top-level definition you're referring to (and everything else in the file gets pulled in as-needed as supporting definitions).

So then I could write:

// foo/a.wit
interface a {
  resource r { ... }
}

// foo/b.wit
use { r } from "foo/a"
interface b {
  f: func(r: r) -> r
}

// foo/c.wit
world c {
  import b: "foo/b"
  export "foo/b"
}

OR, I could write:

// foo/c.wit
interface "foo/a" {
  resource r { ... }
}
use { r } from "foo/a"
interface "foo/b" {
  f: func(r: r) -> r
}
world c {
  import b: "foo/b"
  export "foo/b"
}

noting that the top-level use in the latter file could've also gone inside the world c { ... } to limit its scope and avoid collisions.

Earlier, I suggested removing the <id> : from imports and exports in worlds, with the reasoning being that worlds shouldn't define semantically-meaningful names (so that worlds can be purely structural sets that can be unioned and intersected by component producers and hosts). However, there is a separate structural, non-semantic use for the <id> : prefix: answering the question "what should the bindings generator name this?" (just like record field labels). After considering some other namespace-y approaches that all seemed to have problems, it does seem like the world import/export is the best place, so I'd like to rescind that previous suggestion. (In the absence of an <id> (as in the export of world c above), it seems like Wit could default to a kebab-case name from the path (say, foo-b) to go into the actual generated component; more to discuss there.)
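As a rough illustration of that defaulting idea, the Python sketch below picks the explicit <id> when one is given and otherwise joins the path segments with hyphens. The function name `binding_name` and the exact derivation (split on "/", join with "-") are assumptions for illustration only; the real rule is still "more to discuss".

```python
from typing import Optional


def binding_name(explicit_id: Optional[str], path: str) -> str:
    """Name the bindings generator would use for an import/export (assumed rule)."""
    # An explicit `<id> :` always wins.
    if explicit_id is not None:
        return explicit_id
    # Otherwise derive a kebab-case name from the path segments.
    segments = [segment for segment in path.split("/") if segment]
    return "-".join(segments)


print(binding_name("b", "foo/b"))   # import b: "foo/b"  -> b
print(binding_name(None, "foo/b"))  # export "foo/b"     -> foo-b
```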

Lastly, the URLs written in my example only parse as URLs if there is a base URL supplied to the URL parser. It seems like it might be useful to allow this base URL to be an explicit parameter to tools like wit-bindgen so that the Wit files can be written as above in a base-URL-agnostic manner so that they can be, e.g., referenced both as standards with a wasi: scheme or published to a registry and referenced with an http: scheme.
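A small Python sketch of the base-URL idea, under heavy assumptions: `with_base` is a hypothetical helper, it uses naive string prefixing rather than real RFC 3986 reference resolution, and the `http://example.com/wit/` registry address is made up.

```python
import re

# A reference that already starts with "<scheme>:" is treated as absolute.
_SCHEME = re.compile(r"^[A-Za-z][A-Za-z0-9+.-]*:")


def with_base(base: str, reference: str) -> str:
    """Prefix `reference` with `base` unless it already carries a scheme."""
    if _SCHEME.match(reference):
        return reference
    return base + reference


# The same base-URL-agnostic reference, viewed under two different bases.
print(with_base("wasi:", "foo/b"))                    # wasi:foo/b
print(with_base("http://example.com/wit/", "foo/b"))  # http://example.com/wit/foo/b
```

The point is only that "foo/b" in the Wit file stays unchanged while the tool's base parameter decides how it is ultimately named.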

Thoughts?

Sep 15 '22 23:09 lukewagner

I love the direction we are going with! @alexcrichton's concerns are valid and resonate with me. I like the idea that "wit" refers to "the whole IDL", while "world" refers to a set of imports and exports.

Once the world keyword is introduced, I started to wonder whether we are allowed to have multiple declarations of worlds inline.

world a {
  import ...
  export ...
}

world b {
  import ...
  export ...
}

import ...
export ...

Note that the last two import and export lines form the top-level world definition. The world { ... } scope is omitted because we can assume the world's name is the same as the file name.

Furthermore, since the top level is a world that can contain inline sub-world definitions, it seems like we are allowed to have nested world declarations.

world a {
  world b { ... }
  import ...
  export ...
}

This starts to look very similar to the module system in programming languages like Rust.

Sep 16 '22 06:09 Mossaka

@Mossaka Great comments, thanks. Yes, I think we should be able to both have multiple world definitions in a single Wit (just like you can have multiple interface definitions) and have nesting of world and interface definitions (mirroring the type structure of componenttype and instancetype in components).

If we go with a single Wit format (with a single .wit extension), then I'd be hesitant to have a special "top-level" world that is written outside of any world { ... }, since now the grammar needs to "sniff" the contents to determine whether our top-level definition is a world or interface. That's why my previous examples suggested that these top-level definitions were enclosed in world { ... } or interface { ... } blocks. I also suggested we determine which definition is "top-level" by matching the file name, but this seems a bit fragile and like it could break down if paths/URLs get fancy. So perhaps instead the top-level definition is identified by a default prefix, as in:

// foo/a.wit
default interface a {
  resource r { ... }
}

or

// foo/c.wit
interface "foo/a" { ... }
interface "foo/b" { ... }
default world c {
  import b: "foo/b"
  export "foo/b"
}

? And agreed, it is rather module-system-ish :)
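For what it's worth, here is how a tool might pick the top-level definition under the `default` idea, as a minimal Python sketch. The `Definition` type and `top_level` function are hypothetical, and requiring exactly one `default` per file is an assumption that hasn't been settled in this thread.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Definition:
    kind: str                 # "interface" or "world"
    name: str
    is_default: bool = False  # True when written with the `default` prefix


def top_level(definitions: List[Definition]) -> Definition:
    """Return the single definition marked `default` (assumed: exactly one)."""
    defaults = [d for d in definitions if d.is_default]
    if len(defaults) != 1:
        raise ValueError(
            f"expected exactly one default definition, found {len(defaults)}"
        )
    return defaults[0]


# Mirrors the foo/c.wit example above.
c_wit = [
    Definition("interface", "foo/a"),
    Definition("interface", "foo/b"),
    Definition("world", "c", is_default=True),
]
chosen = top_level(c_wit)
print(chosen.kind, chosen.name)  # world c
```

Unlike matching on the file name, nothing here depends on the path or URL the file was loaded from.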

Sep 16 '22 16:09 lukewagner

How do "worlds" & "interfaces" relate to components?

During the 22 Sept. WASI meeting @fibonacci1729 provided a brief answer. However, I didn't quite understand. :sweat_smile: Would you mind explaining here?

Also, if I recall correctly, "worlds" are supposed to give an extra layer of abstraction over components. How so?

Sep 25 '22 12:09 badeend

Probably your comment is directed at including an answer in the document, not just the comments, but just to answer here briefly, from a component-model AST POV:

a world is IDL syntax for a componenttype
an interface is IDL syntax for an instancetype.

Thus, given any component .wasm, I should be able to automatically derive a Wit file containing a world that is the syntax for that .wasm's component type. See also my slightly longer suggestion of an explanation above.

Sep 26 '22 15:09 lukewagner

I updated the document to reflect the above world design discussion. Additionally, I created #115 to track discussion around introducing the interface keyword to WIT.

Let me know if I missed anything or didn't capture something correctly!

Oct 07 '22 14:10 fibonacci1729

@lukewagner That is the plan! I'm planning to push the PR for (1) today or tomorrow.

Oct 10 '22 16:10 fibonacci1729

One thing I think would be good to specify in the world definition here in this repository is how it maps onto a component type. That being said the existing *.wit doesn't do this, so I think it's ok to do in a follow-up PR, but I wanted to write some thoughts down about this while here.

In the wit-bindgen repository the wit-component tool has the closest concept to a "world" of anything so far, that concretely encompasses:

A world can have any number of named imported interfaces (current *.wit files) so long as the names don't conflict. Each imported interface maps to an imported component instance in the final component.
A world can have any number of exported interfaces so long as the names don't conflict and they all map to an exported component interface from the final component.
A world can also have a "default" interface which represents the items of the *.wit interface being exported from the root-level namespace (e.g. no wrapper component instance).
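Those three points can be sketched as a tiny data model. This Python is purely illustrative and is not wit-component's actual representation: the `World` class and its methods are hypothetical, as is the choice to treat import names and export names as separate namespaces.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class World:
    """Hypothetical model of a wit-component style "world"."""
    imports: Dict[str, str] = field(default_factory=dict)  # name -> interface
    exports: Dict[str, str] = field(default_factory=dict)  # name -> interface
    default: Optional[str] = None                          # root-level interface

    def add_import(self, name: str, interface: str) -> None:
        # Imported names must not conflict with each other.
        if name in self.imports:
            raise ValueError(f"duplicate import name: {name!r}")
        self.imports[name] = interface

    def add_export(self, name: str, interface: str) -> None:
        # Exported names must not conflict with each other.
        if name in self.exports:
            raise ValueError(f"duplicate export name: {name!r}")
        self.exports[name] = interface


world = World()
world.add_import("backend", "wasi:http/handler")
world.add_export("handler", "wasi:http/handler")
world.default = "my-api"  # hypothetical interface exported at the root level
print(world)
```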

That at least is the current expressiveness of wit-bindgen's concept of a "world". It's still quite lacking internally since there's no sense of "sharing types" across all these interfaces, but that's what I'm hoping a future refactor will solve one day. For now though I think it might be good to have in our minds whether this is exactly what we want from world files or not. AFAIK there's been no formal discussion about how to map the entire component model onto *.wit or vice-versa, meaning that precisely how this mapping is done is a bit fuzzy and sort of only in a few folks' heads at this time. The wit-component understanding of "worlds" doesn't map to all aspects of the component model (e.g. instances-of-instances-of-instances-of-functions can't be expressed, nor root-level function imports, nor component/module imports, etc). I think that's ok but I think it's best to be deliberate about all this.

Oct 11 '22 15:10 alexcrichton

Learning about that stuff at the moment, so I can only point out typos 🙄

Nov 01 '22 16:11 martinitus

This is probably out of scope for this design, but I'd like to discuss world unions and intersections and how subtyping works in this regard. I am working on the wasi-kv-store proposal and started to use the new interface and world syntax to design key-value store interfaces.

I have the following design for a key-value store world, and a cloud/service world, which imports the former one.

world "wasi:cloud/services" {
  import kv: {*: "wasi:cloud/kv"}
  import mq: {*: "wasi:cloud/mq"}
  ...
  export http: "wasi:http/handler"
}

world "wasi:cloud/kv" {
  import kv: {*: "wasi:kv/data/crud"}
  
  export http: "wasi:http/handler"
}

I assume that wasi:cloud/kv is a subtype of wasi:cloud/services because the kv in cloud/services can be used in any context where cloud/services is expected.

What is less clear to me is how world union works. For example, suppose I have a kv world that expects some advanced features like transaction and query, and that I have interfaces for them. How could I construct a world/component from it and union it with the crud interface that the wasi:cloud/kv world imports?

As a side note: I notice that optional imports/exports are out of scope for the components MVP, but I am thinking that the kv-store could be a great example of exposing a world with optional imports that capture advanced use cases.

Nov 02 '22 17:11 Mossaka

@Mossaka Here's a first stab at an answer but maybe we should move any follow-up discussion to a new issue so we can dig in. If I understand your scenario correctly, the union of the simple and advanced kv-store worlds would offer both interfaces. Despite them both being logically "kv stores", the two interfaces would by default be as separate as any two other interfaces, and thus if a component imported both, it would have two instance imports (one implementing the simple kv-store interface, one implementing the advanced kv-store interface) each named by their distinct wasi: URLs, and any kv-store operations would be explicit calls into one or the other. Of course a host or virtualization layer could choose to implement the union of kv-store operations with one instance (exporting the union of the two interfaces' fields) and supply the same instance for both imports (allowed by instance subtyping) as long as the two interfaces didn't have conflicting types for the exact same field name. Lastly, the advanced kv-store interface could use from the simple kv-store interface, so that any resource types defined in the simple kv-store interface could be used in the advanced kv-store interface, and this would tie the two together: when you imported the advanced kv-store interface, it would transitively pull in the simple kv-store interface (which then world unioning would "de-dupe").
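To sketch the two mechanisms above in code (a rough Python model, not anything a tool implements today): a world's imports are modeled as a set of interface URLs, so unioning "de-dupes" for free, and a host-side merged instance is allowed only when no field name has conflicting types. The `wasi:kv/data/transaction` URL and all function signatures below are made up for the example.

```python
from typing import Dict, Set


def union_worlds(a: Set[str], b: Set[str]) -> Set[str]:
    """Union two worlds' imports; an interface named by the same URL appears once."""
    return a | b


def merge_instances(x: Dict[str, str], y: Dict[str, str]) -> Dict[str, str]:
    """Build one instance exporting the union of two interfaces' fields."""
    merged = dict(x)
    for field_name, field_type in y.items():
        # The same field name with a different type cannot be merged.
        if field_name in merged and merged[field_name] != field_type:
            raise TypeError(f"conflicting types for field {field_name!r}")
        merged[field_name] = field_type
    return merged


simple_world = {"wasi:kv/data/crud"}
# The advanced world transitively pulls in the simple interface via `use`.
advanced_world = {"wasi:kv/data/crud", "wasi:kv/data/transaction"}
print(sorted(union_worlds(simple_world, advanced_world)))

crud_fields = {"get": "func(key: string) -> string"}
txn_fields = {"begin": "func() -> u32", "get": "func(key: string) -> string"}
print(merge_instances(crud_fields, txn_fields))
```

Here the shared `get` field has the same type in both interfaces, so a single instance could back both imports; changing either signature would make `merge_instances` raise.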

Nov 02 '22 18:11 lukewagner

I was reading over this again just now. As of right now, the differences from what's implemented in wit-parser are:

In worlds, imports/exports of functions/types aren't supported; only interfaces are.
In wit-parser there's additionally support for a default export foo in a world which is the "zero level" export which is sort of what imports/exports of types would be otherwise.

I wanted to write this down to pose the question of how best to resolve this. The as-documented-right-now-in-this-PR semantics don't support the use case of "export directly from the component an interface defined somewhere else" which is what default export is solving. The usage of default export has its own downsides though of conflicting with default as used in use (possibly) and otherwise just feeling sort of bad.

Do others have ideas of how to resolve this?

Dec 02 '22 22:12 alexcrichton

Could we allow world imports and exports to have inline function (and later potentially other) types, e.g.:

world my-world {
  import fs: "wasi:filesystem"
  import foo: func(x: string) -> string
  export handler: "wasi:http/handler"
  export bar: func(x: string) -> string
  export baz: func() -> u32
}

with the compiled component type being:

(component
  (import "fs" "wasi:filesystem" (instance ...))
  (import "foo" (func (param "x" string) (result string)))
  (export "handler" "wasi:http/handler" (instance ...))
  (export "bar" (func (param "x" string) (result string)))
  (export "baz" (func (result u32)))
)

?

Dec 02 '22 22:12 lukewagner

That's what's specified here in this PR, I believe, but the feature of default export in wit-parser is a holdover from wanting to do:

interface foo {
  foo: func()
}

world my-world {
  default export foo
}

where the component to generate is:

(component 
  (export "foo" (func))
)

where the interface foo is sort of splatted into the world directly so it's a bunch of exported functions rather than an exported interface.

Is this a feature worth supporting though? Should support for this just be dropped from wit-parser?

Dec 02 '22 23:12 alexcrichton

Oh, sorry, I misunderstood. I would guess "no" and if we decide we need it in the future, it would be some new keyword specifically for "splatting" interfaces.

Dec 02 '22 23:12 lukewagner

I may have missed or forgotten something, but is this otherwise ready to go? I created https://github.com/bytecodealliance/wasm-tools/issues/859 to remove support for default export from tooling which should bring the tooling up-to-date with this PR.

Dec 06 '22 19:12 alexcrichton

@alexcrichton I'll rebase this onto https://github.com/WebAssembly/component-model/pull/141, does that sound good?

Dec 06 '22 23:12 fibonacci1729

Happy to go either way myself!

Dec 06 '22 23:12 alexcrichton

SGTM! Let's merge this and handle the inconsistent deltas in your PR (which, as far as I can tell, addresses most of them anyway).

Dec 06 '22 23:12 fibonacci1729