User-defined generic types (parametric polymorphism)
After searching the GitHub issues, I have yet to find a ticket where anyone has explicitly asked for this.
Apparently, there has been discussion on something similar to this, but from what I can tell this seems like a sufficiently different proposal to warrant a new issue.
Motivation / Example
Consider that you're writing a world to facilitate a generic API for building bindings to user interfaces from a WASM component. In addition to describing the layout tree (so basically a DOM, but more generic than that, as this could potentially target any sort of GUI framework with a hierarchy of layouts and sub-views), we also want to be able to describe how to bind data to the view.
For this framework, we want to be able to have "reactive values" as part of the API. This is different from a stream<T> because in addition to emitting updates to subscribers on changed values, it also gives you access to the "current" value. Something like:
resource reactive-value<T> {
current-value: func() -> T;
updates: stream<T>;
}
The issue, of course, is that WIT does not currently support user-defined generics. So today we would have to monomorphize by hand and provide implementations for every type T we want to use reactive-value with:
resource i32-reactive-value {
current-value: func() -> i32;
updates: stream<i32>;
}
resource f32-reactive-value {
current-value: func() -> f32;
updates: stream<f32>;
}
resource string-reactive-value {
current-value: func() -> string;
updates: stream<string>;
}
...
Not to mention, if a user wants to extend our API with different types of widgets in another WIT package (e.g. an API for a date-time entry widget), they would also have to do this duplication themselves.
Sketch
To solve this problem, I propose updating the syntax of WIT to allow for custom generic types such as reactive-value<T> to be user-definable. For full generality, we would have to add generics to records, variants, resource types, and function types (though generic function types would obviously not be first-class).
Generic types can be used as you'd expect in packages / interfaces / worlds from the perspective of someone writing a wit package, but in the canonical ABI, all user-defined generic types and functions would be monomorphized, as this could be supported even in languages without parametric polymorphism.
For example:
interface widget-binding {
resource widget<T> {
update-value: func(T);
current-value: func() -> T;
}
// Potential syntax for a standalone generic function
bind-data: <T> func(widget: widget<T>, reactive-value: reactive-value<T>);
}
could be monomorphized into something like:
resource i32-widget {
update-value: func(i32);
current-value: func() -> i32;
}
resource f32-widget {
update-value: func(f32);
current-value: func() -> f32;
}
...
i32-bind-data: func(widget: i32-widget, reactive-value: i32-reactive-value);
f32-bind-data: func(widget: f32-widget, reactive-value: f32-reactive-value);
...
depending on which specific type arguments are configured to be instantiated (similarly to this, this is something that could be configured in the user's build tools).
Note: If we want to allow nested generics (e.g. reactive-value<reactive-value<T>>, which I think we should), we'd have to come up with a different naming scheme for the monomorphized variants. This is just an example.
Additional Benefits
In addition to saving work for developers in not having to manually monomorphize when writing wit definitions, adding support for user-defined generic types / methods in the wit standard also gives third-party codegen tools more data to work with.
For instance, say someone wants to use WIT as a kind of language-agnostic modeling language to define an important data model for a polyglot project. The goal is to use the .wit definition as a single source of truth for the project's data model, which can then (with custom codegen tools) be transformed into idiomatic implementations of this data model in whatever language they wish.
For a language with built-in support for generic types (say Kotlin), something like widget-binding could be translated over fairly directly:
interface WidgetBinding {
interface Widget<T> {
fun updateValue(value: T)
fun currentValue(): T
}
fun <T> bindData(widget: Widget<T>, reactiveValue: ReactiveValue<T>)
}
whereas another language with no generics might resort to manually monomorphized variants (or perhaps even type-erased / dynamically typed variants!) in the generated model code, depending on what is the most idiomatic.
While the primary goal in this example is not multi-language interop, a codegen tool in this scenario could also generate some high-level glue code to translate between the language-specific idiomatic variants and the monomorphized versions needed by the canonical ABI for interop between different WASM components (e.g. by using type reflection and delegating to the proper monomorphized call based on the type).
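As a rough illustration of that kind of glue code, here is a minimal Rust sketch, assuming two monomorphized exports exist. The functions `i32_bind_data` and `f32_bind_data` are hypothetical stand-ins for generated bindings and simply return a label; a real generator would call into the component instead.

```rust
use std::any::Any;

// Hypothetical stand-ins for the monomorphized functions a bindings
// generator might emit. They only return a label so the dispatch is visible.
fn i32_bind_data(value: i32) -> String {
    format!("i32-bind-data({value})")
}

fn f32_bind_data(value: f32) -> String {
    format!("f32-bind-data({value})")
}

/// Idiomatic generic entry point. It inspects the runtime type of `value`
/// and delegates to the matching monomorphized variant, or reports an error
/// when no variant was generated for that type.
pub fn bind_data<T: Any>(value: T) -> Result<String, String> {
    let any: &dyn Any = &value;
    if let Some(v) = any.downcast_ref::<i32>() {
        Ok(i32_bind_data(*v))
    } else if let Some(v) = any.downcast_ref::<f32>() {
        Ok(f32_bind_data(*v))
    } else {
        Err(format!(
            "no monomorphized variant for {}",
            std::any::type_name::<T>()
        ))
    }
}
```

Calling `bind_data(3_i32)` takes the `i32` path, while a type with no generated variant (for example a `&str`) produces an `Err` at runtime rather than a compile-time error, which is the cost of dispatching by reflection.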
In total, I believe this feature would bring great value to the WebAssembly component ecosystem: it makes contracts between components (such as the aforementioned widget example) easier to express, and it allows at least some languages with support for parametric polymorphism to generate higher-level, more idiomatic bindings to a WIT API than would otherwise be possible.
Thanks for filing and the thoughtful writeup!
Just FWIW, another option in WIT today to address the kind of use case you're describing is to roll your own "any" type using resource or variant types. It's true that this will lose the static typing of the generics, but it might be good enough to get started (and way simpler than anything else we might consider adding below).
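To make the "any" workaround concrete, here is a minimal Rust sketch of what guest code might look like if the interface exposed one variant type instead of reactive-value<T>. The names `AnyValue` and `ReactiveValue` and the choice of three cases are assumptions for illustration, not part of any existing WIT package.

```rust
/// Hypothetical Rust mirror of a hand-rolled WIT variant with integer,
/// float, and string cases.
#[derive(Debug, Clone, PartialEq)]
pub enum AnyValue {
    Int(i32),
    Float(f32),
    Text(String),
}

/// A single, non-generic reactive value that carries an `AnyValue`.
pub struct ReactiveValue {
    current: AnyValue,
}

impl ReactiveValue {
    pub fn new(initial: AnyValue) -> Self {
        Self { current: initial }
    }

    pub fn current_value(&self) -> AnyValue {
        self.current.clone()
    }

    /// Accepts an update only if it uses the same variant case as the
    /// current value. This check runs at runtime, which is exactly the
    /// static typing that the "any" approach gives up.
    pub fn update(&mut self, next: AnyValue) -> Result<(), String> {
        if std::mem::discriminant(&self.current) != std::mem::discriminant(&next) {
            return Err(format!(
                "type mismatch: current value is {:?}, update was {:?}",
                self.current, next
            ));
        }
        self.current = next;
        Ok(())
    }
}
```

An `Int` value can be updated with another `Int`, but an attempt to store a `Text` in it is rejected only when the call happens, not when the code is compiled.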
However, I agree we need to go beyond that and I also agree that we probably want to go the monomorphization route. I think a key design question is: do the generic types show up in concrete components (e.g., as represented by a .wasm file) such that given a .wasm, you can pick the Ts to apply to the component; or have the Ts already been picked and "baked in" by the time you get the component. If we go the monomorphization route, the answer is the latter (analogous to the binary you get from C++ or Rust). But this fact makes this feature categorically different from other WIT features because everything else in WIT (modulo syntactic sugar) maps directly to a component type in the concrete component .wasm (in fact, we regularly roundtrip (modulo lexical details) WIT to .wasm to WIT). It also means that now the producer toolchain is doing a lot more heavy lifting since it needs to generate different wasm code (interacting with different core wasm ABIs, but also avoiding the overhead of boxing/universal-representation that you otherwise need if you don't go the monomorphization route).
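The trade-off between "baked in" type arguments and a boxed, universal representation can be seen in miniature in plain Rust. The sketch below is only an analogy under assumed names (`Widget`, `ErasedWidget`); it says nothing about how the component model would actually encode either strategy.

```rust
use std::any::Any;

/// Generic widget. The Rust compiler emits separate code for each concrete
/// `T` the program uses, so the set of `T`s is fixed once the binary exists.
pub struct Widget<T> {
    value: T,
}

impl<T: Clone> Widget<T> {
    pub fn new(value: T) -> Self {
        Self { value }
    }

    pub fn update_value(&mut self, value: T) {
        self.value = value;
    }

    pub fn current_value(&self) -> T {
        self.value.clone()
    }
}

/// Type-erased widget. The stored representation is the same for every
/// payload type, at the cost of a heap allocation and a runtime downcast.
/// The generic helper methods only box and unbox at the boundary.
pub struct ErasedWidget {
    value: Box<dyn Any>,
}

impl ErasedWidget {
    pub fn new<T: Any>(value: T) -> Self {
        Self { value: Box::new(value) }
    }

    pub fn update_value<T: Any>(&mut self, value: T) {
        self.value = Box::new(value);
    }

    /// Returns `None` when the stored payload is not a `T`.
    pub fn current_value<T: Any + Clone>(&self) -> Option<T> {
        self.value.downcast_ref::<T>().cloned()
    }
}
```

`Widget<i32>` and `Widget<String>` are distinct types checked at compile time, whereas one `ErasedWidget` can hold an `i32` and later a `String`, with mismatches surfacing as `None` at runtime.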
Also, iiuc, if you go the monomorphization route, the feature isn't technically "parametric polymorphism" or "generics" (like one has in, say, OCaml or C#). Thus, in #172 and casual discussion, we've been calling this feature "WIT templates". #172 looks superficially a lot different than what you're asking for here because it was attempting to carve out a small bite for some particular WASI use cases, but I think if one takes #172 farther, you need type variables too. However even #172 was too big once we started to dig into it (lots of runtime and producer toolchain implications), hence we've currently shelved the "WIT templates" feature until after 1.0.
Another realization working towards #172 is that, if we apply our "virtualization" principle to WIT templates (which is: every world should be implementable by a component; not just a native host) and ask what features a component would need to virtualize a templated world, I think the answer is: staged programming features. Particularly, a component supporting a templated world would need the ability to reflect on its given type arguments and then generate wasm code, all "Ahead of Time" (staged) so that the generated wasm code could be AOT-compiled to machine code and run like normal at runtime without overhead. If we did it right, the types introduced to describe these staged components' public interface would allow us to roundtrip WIT templates through .wasm, thereby defining the semantics of WIT templates the same way we define the rest of WIT. Thus, I think there is a natural complementarity between WIT templates and staged programming features which ideally we'd ensure by co-designing both features together (again, post-1.0).
Thus, I agree with the problem statement, but unfortunately I think this is a pretty big work item to do properly. But if you or anyone wants to work on prototyping it, that'd be awesome, and I'd be happy to help sketch.
Thank you for taking the time to read my proposal and offer your analysis!
I am glad to hear you agree with the problem statement, and I will have to think more about your comments. I am pretty new to WASM and still learning, but I have a few PLT projects under my belt, so I would definitely be interested in contributing in the future.
I think a key design question is: do the generic types show up in concrete components (e.g., as represented by a .wasm file) such that given a .wasm, you can pick the Ts to apply to the component; or have the Ts already been picked and "baked in" by the time you get the component. If we go the monomorphization route, the answer is the latter (analogous to the binary you get from C++ or Rust). But this fact makes this feature categorically different from other WIT features because everything else in WIT (modulo syntactic sugar) maps directly to a component type in the concrete component .wasm (in fact, we regularly roundtrip (modulo lexical details) WIT to .wasm to WIT).
That is a really important point. I think the former is obviously much more powerful, whereas the latter (monomorphization) would be easier to implement. Again, I'm pretty new to all this and I'm still trying to wrap my head around things like canon lower, but based on what I've seen in this talk about how the Canonical ABI can change its representation depending on whether the client / host component is sync or async, I'm wondering if we could do something similar for generics.
e.g. a language with monomorphizing generics (like Rust or C++) could interop with another language with monomorphizing generics using strategy X, but with a language with type-erased generics (like Java) and / or runtime type information (like C#) using strategy Y or Z.
I think the former approach would also be really important for more complex examples to avoid combinatorial explosion. For instance, consider generic functions with 2, 3, 4 or even more type parameters. (Also using a made-up syntax here for higher-order functions scoped to the call's lifetime):
interface reactive-methods {
map: <a, b> func(x: reactive-state<a>, f: func(a) -> b) -> reactive-state<b>;
combine-3: <a, b, c, d> func(x: reactive-state<a>, y: reactive-state<b>, z: reactive-state<c>, f: func(a, b, c) -> d) -> reactive-state<d>;
...
}
Just specifying "I want to brute-force monomorphize all of these generic parameters for all possible combinations of types I specify in my config" would get ugly fast.
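To put a rough number on "ugly fast": if every type parameter may independently be any of n configured types, a function with k type parameters needs n^k instantiations. The helper below is a small sketch of that arithmetic (the function name is made up, and nested generic arguments, which would make things worse, are ignored).

```rust
/// Number of instantiations needed to brute-force monomorphize one function
/// that has `type_params` independent type parameters, when each parameter
/// may be any of `candidate_types` concrete types.
pub fn instantiation_count(candidate_types: u64, type_params: u32) -> u64 {
    candidate_types.pow(type_params)
}
```

With just three configured types, map (two type parameters) needs 9 variants and combine-3 (four type parameters) needs 81. With ten configured types, combine-3 alone needs 10,000.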
If the importing component could somehow specify what concrete types are actually needed based on how the generic functions are used (as I think you're suggesting), that'd be much nicer.
But if you or anyone wants to work on prototyping it, that'd be awesome, and I'd be happy to help sketch.
Yeah, I was actually thinking of prototyping this as a superset of the existing .wit standard (maybe calling it .wittier as a play on words) that could be transpiled back to regular .wit. But maybe eventually I could try to prototype an actual way of integrating this into the component model. I think I'd need to familiarize myself with the component model a bit more first.
wit-bindgen is currently missing Haskell and Kotlin/wasm binding generators, so I might try my hand at those first to help better wrap my head around the canonical ABI.
Yeah, agreed on the combinatorial blow-up issues with monomorphization. I could also imagine an orthogonal and complementary proper "generics" feature (distinct from "WIT templates") that used boxing/universal-representation to (less efficiently) allow a single compiled core module wrapped in a component to work for any T. The alternative approach you mention is intriguing, and I probably don't fully have the picture in my head, but it sounds a bit like C#'s reified generics approach which, from what I hear, is rather complex.
Anyhow, it's great to hear that you're thinking of working on bindings generation and then doing some prototyping; I'm interested to hear what you find.