Implement a parser for profile syntax in wit-bindgen
Would love to help with this if needed!
Hey @alexcrichton!
I've sketched out a rudimentary BNF grammar for profile syntax:
Profile ::= { Extend | Import | Export } .
Extend ::= `extend` string .
Import ::= `import` ident ":" "{" ("*" | "+") ":" string "}" .
Export ::= `export` ident ":" "{" ("*" | "+") ":" string "}" .
Where the operators of the grammar are defined as:
{} is repetition (0 to n)
() is grouping
| is alternation
So a profile is a collection of zero-or-more Extend, Import, or Export items.
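To make the grammar's terminals concrete, here is a minimal, dependency-free tokenizer sketch. It is not the wit-parser lexer: the `Tok` enum, the `tokenize` function, the identifier rules (ASCII letters, digits, `-`, `_`), the lack of string escapes, and the file names in the usage note are all assumptions made for illustration.

```rust
// Hypothetical token type for the profile grammar; the real wit-parser
// lexer has its own token enum and span tracking.
#[derive(Debug, Clone, PartialEq)]
enum Tok {
    Extend,
    Import,
    Export,
    Ident(String),
    Str(String),
    Colon,
    LBrace,
    RBrace,
    Star,
    Plus,
}

// Splits `src` into tokens, or returns an error message on bad input.
fn tokenize(src: &str) -> Result<Vec<Tok>, String> {
    let mut out = Vec::new();
    let mut chars = src.chars().peekable();
    while let Some(&c) = chars.peek() {
        if c.is_whitespace() {
            chars.next();
        } else if c == '"' {
            chars.next(); // consume the opening quote
            let mut s = String::new();
            loop {
                match chars.next() {
                    Some('"') => break,
                    Some(ch) => s.push(ch), // assumption: no escape sequences
                    None => return Err("unterminated string".to_string()),
                }
            }
            out.push(Tok::Str(s));
        } else if c.is_ascii_alphabetic() {
            let mut word = String::new();
            while let Some(&ch) = chars.peek() {
                if ch.is_ascii_alphanumeric() || ch == '-' || ch == '_' {
                    word.push(ch);
                    chars.next();
                } else {
                    break;
                }
            }
            // Keywords take priority over plain identifiers.
            let tok = match word.as_str() {
                "extend" => Tok::Extend,
                "import" => Tok::Import,
                "export" => Tok::Export,
                _ => Tok::Ident(word.clone()),
            };
            out.push(tok);
        } else {
            let tok = match c {
                ':' => Tok::Colon,
                '{' => Tok::LBrace,
                '}' => Tok::RBrace,
                '*' => Tok::Star,
                '+' => Tok::Plus,
                other => return Err(format!("unexpected character {other:?}")),
            };
            chars.next();
            out.push(tok);
        }
    }
    Ok(out)
}
```

Under these assumptions, an input such as `extend "base.world" import fs: { *: "wasi-fs.wit" }` would yield one `Extend` token, a string, and then the eight tokens of a single `Import` item.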
Prototype Implementation
Assuming we want to add support for this in the wit-parser crate, the following is a high-level overview of the work to be done.
Introduce new tokens to lexer (i.e. extend, import, export, and +).
Introduce new ASTs for Import, Export, Extend, and top-level type for Profile:
pub struct ProfileAst<'a> {
    pub items: Vec<ProfileItem<'a>>,
}

pub enum ProfileItem<'a> {
    Import(Import<'a>),
    Export(Export<'a>),
    Extend(Extend<'a>),
}

pub struct Import<'a> {
    docs: Docs<'a>,
    name: Id<'a>,
    kind: ImportKind,
    path: &'a str,
}

enum ImportKind {
    ZeroOrMore,
    OneOrMore,
}

pub struct Export<'a> {
    docs: Docs<'a>,
    name: Id<'a>,
    kind: ExportKind,
    path: &'a str,
}

enum ExportKind {
    ZeroOrMore,
    OneOrMore,
}

pub struct Extend<'a> {
    docs: Docs<'a>,
    path: &'a str,
}
Omitting the implementations of parse here for brevity.
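For readers who want a feel for what the omitted `parse` step might look like, here is a rough recursive-descent sketch over an already-lexed token list. Everything in it is a simplification and an assumption: the `Tok`, `Kind`, `ProfileItem`, `Parser`, and `parse_profile` names are hypothetical, strings are owned rather than borrowed, docs and spans are dropped, and mapping `*` to `ZeroOrMore` and `+` to `OneOrMore` is inferred from the variant names rather than stated anywhere.

```rust
// Hypothetical, simplified tokens and AST; the real wit-parser types differ.
#[derive(Debug, Clone, PartialEq)]
enum Tok { Extend, Import, Export, Ident(String), Str(String), Colon, LBrace, RBrace, Star, Plus }

#[derive(Debug, PartialEq)]
enum Kind { ZeroOrMore, OneOrMore }

#[derive(Debug, PartialEq)]
enum ProfileItem {
    Extend { path: String },
    Import { name: String, kind: Kind, path: String },
    Export { name: String, kind: Kind, path: String },
}

struct Parser { toks: std::vec::IntoIter<Tok> }

impl Parser {
    // Consumes the next token and checks that it equals `want`.
    fn expect(&mut self, want: Tok) -> Result<(), String> {
        match self.toks.next() {
            Some(t) if t == want => Ok(()),
            other => Err(format!("expected {want:?}, found {other:?}")),
        }
    }

    fn string(&mut self) -> Result<String, String> {
        match self.toks.next() {
            Some(Tok::Str(s)) => Ok(s),
            other => Err(format!("expected string, found {other:?}")),
        }
    }

    fn ident(&mut self) -> Result<String, String> {
        match self.toks.next() {
            Some(Tok::Ident(s)) => Ok(s),
            other => Err(format!("expected identifier, found {other:?}")),
        }
    }

    // Shared tail of Import and Export:
    //   ident ":" "{" ("*" | "+") ":" string "}"
    fn binding(&mut self) -> Result<(String, Kind, String), String> {
        let name = self.ident()?;
        self.expect(Tok::Colon)?;
        self.expect(Tok::LBrace)?;
        let kind = match self.toks.next() {
            Some(Tok::Star) => Kind::ZeroOrMore, // assumed meaning of `*`
            Some(Tok::Plus) => Kind::OneOrMore,  // assumed meaning of `+`
            other => return Err(format!("expected `*` or `+`, found {other:?}")),
        };
        self.expect(Tok::Colon)?;
        let path = self.string()?;
        self.expect(Tok::RBrace)?;
        Ok((name, kind, path))
    }
}

// Profile ::= { Extend | Import | Export } .
fn parse_profile(toks: Vec<Tok>) -> Result<Vec<ProfileItem>, String> {
    let mut p = Parser { toks: toks.into_iter() };
    let mut items = Vec::new();
    while let Some(tok) = p.toks.next() {
        let item = match tok {
            Tok::Extend => ProfileItem::Extend { path: p.string()? },
            Tok::Import => {
                let (name, kind, path) = p.binding()?;
                ProfileItem::Import { name, kind, path }
            }
            Tok::Export => {
                let (name, kind, path) = p.binding()?;
                ProfileItem::Export { name, kind, path }
            }
            other => return Err(format!("expected an item keyword, found {other:?}")),
        };
        items.push(item);
    }
    Ok(items)
}
```

Because Import and Export share the same tail in the grammar, the sketch parses both through one `binding` helper; whether the real AST should keep separate `ImportKind` and `ExportKind` enums is left open here.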
Also, I'll update the function unwrap_md to accept an extension parameter (i.e. wit | world) to parse from markdown.
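As an illustration of the extension parameter idea, the following line-based sketch keeps only the contents of fenced blocks whose info string matches the requested extension. It is not the existing `unwrap_md` from wit-parser and makes no claim about how that function works; the name `unwrap_md_with_ext` and its signature are hypothetical, and the sketch assumes fences are exactly three backticks with no nesting.

```rust
// Hypothetical stand-in for an extension-aware `unwrap_md`.
// Returns the concatenated contents of every fenced block tagged `ext`
// (for example "wit" or "world"), one source line per output line.
fn unwrap_md_with_ext(contents: &str, ext: &str) -> String {
    // Build the fence marker at runtime so this snippet can itself be
    // embedded in a markdown code block without confusion.
    let fence = "`".repeat(3);
    let mut out = String::new();
    let mut in_block = false;
    for line in contents.lines() {
        let trimmed = line.trim();
        if in_block {
            if trimmed == fence {
                in_block = false; // closing fence of a matching block
            } else {
                out.push_str(line);
                out.push('\n');
            }
        } else if let Some(info) = trimmed.strip_prefix(fence.as_str()) {
            // Only enter blocks whose info string is exactly `ext`.
            in_block = info.trim() == ext;
        }
    }
    out
}
```

With this shape, a caller parsing a `.world.md` file would pass `"world"` and a caller parsing a `.wit.md` file would pass `"wit"`, so both formats could share one markdown-unwrapping path.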
Tests
I was thinking it makes sense to segment the tests into two directories for wit and world, i.e. tests/ui/wit/... and tests/ui/world/... but I'm happy to do whatever you prefer.
If this overview/approach makes sense, I'm happy to start prototyping and PR incrementally to the wit-parser crate!
Also, I recognize this is the very first step to adding support. Next steps would include a validate/resolve/expand pass that lowers the AST into a top-level Profile struct living in lib.rs. But the mechanics of this "lowering" are not quite clear to me yet.
I think what you've sketched out seems reasonable but I'm not really the best person to lead and/or review such an implementation. I don't have a broader vision for how *.world files would otherwise fit into the wit-bindgen tooling.
Personally I'm wary of tacking on something after the fact with the intention that it will eventually be better integrated; I'd ideally prefer to have a vision for how this will all fit together eventually. I don't have such a vision yet, myself.
Would a good next step be to write up a more-complete world.md in the style of the current wit.md?
While that would probably help a little, I'm thinking of the broader picture of how everything integrates into the tooling/bindings for each language. To me it's pretty clear how to design the parser, and tweaking the syntax isn't really all that hard; the more difficult aspect is plumbing this through to each generator, updating the UX for existing generators, etc. I don't know how all that's going to work out myself.
Note that the syntax design is underway in the component model repo now.
Current discussion and design: https://github.com/WebAssembly/component-model/pull/83
This is now largely done with the world file parser added recently, and this will continue to evolve in short order. Further issues are best opened on the wasm-tools repo from now on, as that's the new home for the wit-parser crate.